RANDOMIZED BRANCH SAMPLING FOR ESTIMATING 
FRUIT NUMBER 


S. C. Pearce and D. A. Holland 
East Malling Research Station, Kent, England 


A method called “randomized branch sampling” has recently been 
advanced (Jessen [1955]) for determining the number of fruits on a 
tree. In this paper its usefulness will be considered, especially in 
comparison with a long-established method (Hoblyn [1931]) available 
for complete enumeration and readily adapted to sampling. 

The particular form of randomized branch sampling recommended 
was that denoted by the symbol PPA, i.e., ‘probabilities proportional 
to area’. The recorder starts at the trunk and measures the girths of 
the main branches; he calculates their squares and selects a branch 
randomly, the probability of selection being proportional to basal area. 
He follows up the branch chosen to the next fork, where he repeats the 
process, and so on. After a number of such selections he arrives at a 
unit of the tree upon which he counts the fruit, and an estimate of 
fruit number for the whole tree is then calculated. 
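The procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: the tree, its girths and its fruit counts are invented, fruit is assumed to sit on terminal units only, and the helper names (`leaf`, `fork`, `selection_probs`, `sample_estimate`, `all_estimates`) are hypothetical. The estimate is taken as the count on the selected unit divided by the product of the selection probabilities along the path, which is the usual way of making such an estimate unbiased.

```python
import random

def leaf(girth, fruit):
    """A terminal unit on which fruit is actually counted."""
    return {"girth": girth, "fruit": fruit, "children": []}

def fork(girth, children):
    """A branch that divides into the given children."""
    return {"girth": girth, "fruit": 0, "children": children}

def selection_probs(children, power=2):
    """Selection probabilities proportional to girth**power (power=2 is PPA)."""
    weights = [c["girth"] ** power for c in children]
    total = sum(weights)
    return [w / total for w in weights]

def sample_estimate(tree, rng, power=2):
    """Walk from the trunk to one terminal unit; return count / path probability."""
    node, prob = tree, 1.0
    while node["children"]:
        probs = selection_probs(node["children"], power)
        i = rng.choices(range(len(probs)), weights=probs)[0]
        prob *= probs[i]
        node = node["children"][i]
    return node["fruit"] / prob

def all_estimates(tree, power=2, prob=1.0):
    """Yield (path probability, estimate) for every terminal unit."""
    if not tree["children"]:
        yield prob, tree["fruit"] / prob
        return
    probs = selection_probs(tree["children"], power)
    for child, p in zip(tree["children"], probs):
        yield from all_estimates(child, power, prob * p)

# Invented tree: two main branches with three and two sub-branches (1000 fruit).
tree = fork(60, [
    fork(40, [leaf(20, 250), leaf(25, 200), leaf(18, 150)]),
    fork(35, [leaf(22, 220), leaf(24, 180)]),
])

pairs = list(all_estimates(tree))
expected = sum(p * e for p, e in pairs)                  # equals the true total
variance = sum(p * (e - expected) ** 2 for p, e in pairs)
one_draw = sample_estimate(tree, random.Random(1))       # a single random path
```

Changing `power` alters the variance of the estimates but not their expectation, which is why the choice of power bears on accuracy and not on bias.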

Such a method of sampling can be judged only in relation to the 
purpose of the investigation and the species concerned. Thus, the pur- 
pose may be the estimation of the crop on only the particular tree or 
set of trees being sampled, as in an experiment, or the trees may them- 
selves constitute a sample, as when a survey is made of crops in a large 
area; different standards of accuracy might be required in the various 
cases. 

Again, with some species the branches fork more frequently than 
with others, and there may be differences in ease of access to the forks 
for measurement, and there may be varying chances of causing damage 
while doing so. Also, some species usually bear their fruit on new wood, 
some on old, and some on both. Consequently it is unlikely that any 
one method can be used in all instances without modification. 

The claim is made that the procedure recommended is operationally 
simple. No doubt it would be possible to minimize the amount of 
arithmetic by using a nomogram or table to give a²/(a² + b²) from 
a and b, and a selection of random numbers could be written down 
beforehand. Nevertheless, the experience of the present writers with 
a range of European fruit tree species leads them to question this 
operational simplicity, at least as a general rule. Also, the result of 
sampling is not readily checked unless marks are made on the tree to 
show the result of each random selection. 

However, even if practicable, the method does not appear to be 
very accurate. Data are given from only one tree, a pineapple orange 
in Florida in 1953, with results that are rather discouraging. This 
tree had two main branches, one of which had three sub-branches 
and the other had two. When these five sub-branches were taken 
successively as units for counting, it was found that they severally 
gave estimates of the total number of fruit on the tree as follows: 
1696, 903, 874, 1701, 1171, the true figure being 1379 and the variance 
of the error of estimation 128,545. For some purposes this would not 
be accurate enough. Thus, the coefficient of variability of crops of 
mature trees is commonly about 30-40 per cent and may be less (Hoblyn 
[1931]), so for purposes of an experiment with one-tree plots it is desirable 
that the standard error introduced by sampling shall not exceed 10 
per cent of the mean if its effect on the overall accuracy is to be negligible, 
i.e., the variance needs to be less than about 19,000. 
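The figure of about 19,000 follows from the numbers quoted above, as the short check below shows. It is a sketch only: the variable names are hypothetical, a between-tree coefficient of variability of 30 per cent is taken from the lower end of the quoted range, and the sampling error is assumed to be independent of the tree-to-tree variation so that the two combine in quadrature.

```python
import math

true_count = 1379            # true number of fruit on the tree
reported_variance = 128545   # variance of the error of estimation, as quoted

# A standard error of 10 per cent of the mean corresponds to this variance.
max_variance = (0.10 * true_count) ** 2

# Relative standard error actually achieved by the five PPA estimates.
reported_rel_se = math.sqrt(reported_variance) / true_count

# Effect of a 10 per cent sampling error on a 30 per cent between-tree
# coefficient of variability (assumed independent sources of variation).
between_tree_cv = 0.30
combined_cv = math.hypot(between_tree_cv, 0.10)
relative_increase = combined_cv / between_tree_cv - 1

print(round(max_variance), round(reported_rel_se, 2), round(relative_increase, 3))
```

On these assumptions a 10 per cent sampling error raises the overall coefficient of variability by only about one-twentieth, whereas the quoted variance corresponds to a sampling error of roughly a quarter of the mean.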

Some interest attaches to the assumption that the number of fruit 
on a branch is proportional to the square of the branch girth. The 
question does not appear to have been considered previously; though 
Wilcox [1940-41], working with apples, has suggested that the weight 
of fruit on a tree varies as (trunk girth)"*°. On general grounds, how- 
ever, it is to be expected that the quantity of fruit would be directly 
related to the weight of the branch, and unpublished work at East 
Malling on apples suggests that this varies as the branch girth raised 
to the power 2.8 or 3.0, which supports the findings of Sudds and 
Anthony [1929]. Certainly, for the tree from which data are available, 
substitution of cubes for squares leads to the estimates: 1625, 1117, 
1164, 1427, 1194, which are better, the variance being now 37,931. 
However, the use of the fourth power is yet better, the estimates becom- 
ing 1587, 1378, 1635, 1235, 1243, and the variance 26,994, but even these 
results are hardly good enough for the purpose mentioned, while the 
method appears arbitrary. 

Although the method will be unbiassed whatever power is used, 
it is desirable to adopt a value that is justified biologically and this will 
on the average lead to the greatest accuracy. In the absence of a 
physiological basis the choice of power must be made empirically, but 
to do this justifiably would need an extensive study. A figure should 
not be accepted on the basis of a few data unless there were convincing 
prior reasons for it. Judging from experience with the analogous 
problem of relating tree weight to girth (Pearce [1952]), it is not certain 
that any one power will be generally valid. 

From the data presented there is little hope that precision would be 
improved by taking selection beyond the sub-branches and counting 
more units keeping the total sample about one-fifth of the tree; while 
the work of measuring girths, raising to a power and randomizing would 
be increased. Improvement would therefore have to come from 
increasing the sample size. 

In this connection, it is worth recalling the method proposed some 
quarter-century ago for measuring the one-year-old shoots on trees 
(Hoblyn [1931]). This makes use of the fact that, at any fork, one 
branch must morphologically be more distal than the other, and consists 
simply of recording the more distal one first. This method has been 
found to work very satisfactorily for a number of species over many 
years not only for measuring shoots, incidence of disease, etc., but for 
counting fruits, blossoms, and leaves. For these last purposes a 
“clicker-counter” is invaluable. This method can be so rapid that in some 
instances no sampling method appears to be called for. 

However, if one is needed, Hoblyn’s method of enumeration, which 
in effect arranges shoots in an objective order, readily provides one, for 
it is often sufficient to observe every nth shoot instead of all of them. 
Such a method could reasonably be expected to be better than “PPA 
with sub-branches,” because, according to Jessen’s table, this is about 
as efficient as “PH with smallest branches”, and the proposed method 
should be better than this for a given sample size, partly because it 
uses smaller sampling units and partly because it disperses the sample 
over the tree instead of taking it at random. A difficulty in testing this 
suggestion from Jessen’s data arises from his giving no indication which 
branch at a fork is more distal, this information not being required by 
his method. Accordingly, ten examples have been considered, all based 
on the orange tree for which Jessen published his data. In five examples, 
the branches at each fork were given a random order, the first being 
accounted the most distal; in the others, the branches were given an 
order at random only where none was suggested by subsequent branch- 
ings as being botanically likely. For the completely random examples 
variances for n = 3, 5, and 10 were respectively 17,770, 15,005, and 
14,502; for the partly random ones the corresponding figures were 
113,120, 13,355, and 4,938. On account of the lack of data these 
figures should not be regarded as very reliable; but they are quite 
encouraging. It may be noted that Jessen’s “smallest branches” are 
larger units than one-year shoots, and the latter should therefore give 
more accurate estimates. On the other hand, some unit larger than 
one-year shoots might be preferable if fruit were partly borne on old 
wood. With Jessen’s method, about one-fifth of the fruits must be 
counted and certain branch girths measured; with the method suggested 
here, for n equal to five the amount of counting will be about the same, 
no branch girths will be needed, but the whole tree will have to be sur- 
veyed, so the time spent will be about the same or perhaps less. The 
evidence suggests that accuracy will be better. Nevertheless, it is not 
clear what is the best method of recording; this is regrettable in view 
of the importance of the subject, and further work is needed. Until 
more is known the writers would prefer to keep to Hoblyn’s method of 
complete enumeration, or, if that were too laborious, they would count 
the fruit on every n-th shoot. 
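Counting the fruit on every n-th shoot can be sketched as follows. The shoot counts are invented, the shoots are assumed to be listed already in the objective order that Hoblyn’s enumeration provides, and the function name `systematic_estimate` and the use of a starting position chosen among the first n shoots are assumptions made for the illustration.

```python
def systematic_estimate(counts, n, start):
    """Estimate the tree total from every n-th shoot, beginning at `start`."""
    if not 0 <= start < n:
        raise ValueError("start must lie between 0 and n - 1")
    return n * sum(counts[start::n])

# Invented fruit counts on 15 shoots, in enumeration order.
shoot_counts = [3, 0, 5, 2, 4, 1, 0, 6, 2, 3, 5, 1, 0, 4, 2]
n = 5

# One estimate for each possible starting shoot.
estimates = [systematic_estimate(shoot_counts, n, s) for s in range(n)]
true_total = sum(shoot_counts)
mean_estimate = sum(estimates) / n
variance = sum((e - true_total) ** 2 for e in estimates) / n
```

If the starting shoot is chosen at random the estimate is unbiased, since the possible samples between them cover every shoot exactly once; the variance depends on how the fruit is distributed along the enumeration order.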


SUMMARY 


The method of randomised branch sampling for fruit number is 
examined and compared with some alternatives. 


THE ERROR OF REPLICATED POTENCY ESTIMATES IN 
A BIOLOGICAL ASSAY METHOD OF THE PARALLEL LINE 
TYPE* 


Mindel C. Sheps and Paul L. Munson 


Department of Biostatistics, School of Public Health, Biological Research Laboratories 
and Departments of Preventive Medicine and Pharmacology, School of Dental 
Medicine and Medical School, Harvard University, Boston, Mass., U.S.A. 


INTRODUCTION 


In the practice of biological assay the most widely used formulas 
[2, 3] for the variance and the fiducial limits of a potency estimate take 
account of intra-assay error only. The assumption implicit in the use 
of these formulas that inter-assay variation need not be considered 
may be tested by comparing the variation predicted by the formula 
with the variation actually observed in replicated assays of unknowns. 
Such a test in a collaborative penicillin assay of the parallel line type 
analyzed by Bliss [4] showed that the actual error of the log potency 
estimate (M) was considerably and significantly greater than the error 
predicted from intra-assay statistics. It was suggested that a large 
portion of the discrepancy might be due to errors in dosage of test 
substances. A similar suggestion was made by Dews and Berkson [5] 
in their consideration of the actual error of quantal assays. Excess 
variability in M occurring during collaborative trials for International 
Standards of several antibiotics [6-10] was at times associated with, 
although not fully explained by, significant deviations from linearity 
or from parallelism. Although many of the observed discrepancies 
were small in an absolute sense, they were detected because of the 
considerable innate (within-group) precision of the assay methods. 
The actual error of M for several other International Standards [11-13] 
was also greater than predicted by the usual computations. Inter- 
actions of M with replicate samples of the same milk within assays and 
also between assays were observed by Clarke [14] in parallel line assays 
for riboflavin potency. 


*Presented in part before the Eighth and Ninth [1] Annual Meetings of the Biometric Society, 
Eastern North American Region, New York, December 27, 1955 and Detroit, September 10, 1956. 

This investigation was supported in part by research grants (C-2056 and C-2191) from the National 
Cancer Institute of the National Institutes of Health, Public Health Service. 

These results and others like them [15-17] from both quantal and 
graded-response assay methods suggest that the occurrence of hetero- 
geneity in a group of replicate M’s may be due not only to occasional 
accidental gross errors in procedure, or to basic non-validity of the 
assay [18, 19], but also to random experimental variation. In such 
cases, the sources of inter-assay error must be sought, and measures 
taken to offset them. If inter-assay variation can be eradicated or 
progressively diminished by improvements in the method, the internal 
estimate of error is approached as a minimum. However, as long as 
inter-assay error is present, the precision of the method cannot be 
estimated adequately by the internal statistics alone and an accurate 
estimate of the total error of M including the inter-assay variation is 
needed. 

We have investigated this problem in several rather large series of 
routinely performed parallel line assays in which many unknowns were 
tested repeatedly. The most extensive data were obtained from a 
chick comb weight assay method for androgens as performed over a 
period of about three years. A smaller series of replicate M’s from a 
parathyroid assay method have also been analyzed. 


SOURCE AND SELECTION OF ANDROGEN ASSAY DATA 
Assay Procedure 


The initial procedure for the androgen assays was based on the 
method of Rakoff et al. [20] with several modifications [21, 22]. Our 
experiments on the assay method have led to five successive modifica- 
tions in procedure resulting in five series of similar assays. Series I 
and II were conducted with ether as the vehicle for androgen. They 
differed from each other in the volume of ether solution applied per 
dose and in the dosage of androsterone, the reference standard (Table 1). 
Series III was a group of experimental assays that will not be considered 
here. Ethanol was substituted for the more volatile ether as the vehicle 
for androgen in Series IV and V, which were distinguished from each 
other only by the dosage of androsterone administered. 

Each assay included two or three dose levels of androsterone, and 
several unknowns at one or two dose levels. The unknowns were either 
extracts of human urine collected in the course of several clinical endo- 
crine studies [21, 23-25], or synthetic steroids chemically different from 
androsterone. Each unknown was assayed at least twice. More than 
two tests were conducted when an entire assay was invalid, when there 
was notable disagreement between the first two potency estimates, or 
for increased precision. The time interval between the replicated 
assays of a given unknown varied, and naturally tended to be longer 
for unknowns with more than two replications. 


TABLE 1 


CHARACTERISTICS OF THE ASSAY METHOD FOR ANDROGENS IN FOUR SERIES 
OF ASSAYS VARYING IN DESIGN AND TECHNIQUE 


                                            | Series I    | Series II    | Series IV    | Series V 
Number of assays                            | 13          | 61           | 28           | 16 
Vehicle for the androgen                    | Ether       | Ether        | Ethanol      | Ethanol 
Volume of solution used per application     | 0.05 ml.    | 0.01 ml.     | 0.01 ml.     | 0.01 ml. 
Range in daily dose for the standard        | 0.1-1.6 µg. | 0.05-0.8 µg. | 0.05-0.8 µg. | 0.1-0.8 µg. 
Approximate number of chicks 
   for entire assay                         | 100         | 100          | 200          | 200 
   for the standard                         | 30          | 30           | 40           | 40 
Mean for the standard slope (b̄_s)           | 24.2        | 24.2         | 24.2         | 30.8 
Inter-assay variance of standard slope 
   (observed) V(b_s)                        | 36.6        | 30.0         | 15.4         | 18.4 
Variance of standard slope estimated 
   from intra-assay statistics              | 15          | 16           | 7            | 12 
Mean residual variance per response         | 105         | 118          | 95           | 94 
Mean λ (s/b)                                | 0.42        | 0.45         | 0.40         | 0.32 
Median value of g in random sample 
   of assays                                | 0.04        | 0.04         | 0.01         | 0.01 


Criteria for Selection of Data 


All replicated assays that met certain definite criteria were included 
in this report. Data were used only from assays in which the responses 
to the reference standard were linear, with a slope lying within the 
99% tolerance limits of the mean slope for standard in all assays of 
the series. Furthermore, within these acceptable assays, only those 
potency estimates were used that were based on responses parallel to 
the standard curve (P > .01). The distribution of 279 replicate M’s 
that satisfied these criteria is shown in Table 2.* 


*Tabulations of the replicate M’s are not included in this report, but mimeographed copies of 
these data are available and will be supplied on request. 


TABLE 2 

DISTRIBUTION OF REPLICATE LOG POTENCY ESTIMATES FROM ANDROGEN ASSAYS 

                                 | Assay series | Assay series IV and V  | All 
                                 | I and II     |                        | 
                                 | Urine        | Urine     | Synthetic  | 
                                 | extracts     | extracts  | steroids   | 
Unknowns with: 
   Two estimates                 | 60           | 28        | 18         | 106 
   Three estimates               | 11           | 0         | 4          | 15 
   Four estimates                | 4            | 0         | 0          | 4 
   Six estimates                 | 0            | 0         | 1          | 1 
Total unknowns                   | 75           | 28        | 23         | 126 
Number of assays yielding 
   replicates                    | 62           | 16        | 18         | 96 
Number of replicate estimates    | 169          | 56        | 54         | 279 


Comparison of the Assay Series 


The effect of the androgen on the chick comb, that is the assay 
response, was expressed as the coded logarithm of the ratio of the comb 
weight to the body weight. The procedural and statistical characteristics 
of all the assays from Series I, II, IV and V are summarized 
separately for each series in Table 1. The intra-assay statistics from 
the first two series (ether assays) were similar. Although the index of 
precision (λ = s/b) was undesirably high, it was possible to compensate 
for it partially by the use, in each assay, of approximately 30 chicks 
for the standard and 10-20 chicks for each unknown. 

In Series IV and V (ethanol assays) the number of chicks for andros- 
terone was increased to approximately 40. The assays in these series 
were characterized by a slight reduction in residual error variance and 
an increased slope of the regression line, resulting in a moderate (25-30%) 
improvement in the index of precision, λ. Altogether, about 16,000 
chicks were used in the assays reported here. 

It would have been difficult to compare the combined slopes (b_c) 
from the individual assays because they included various combinations, 
at various dose levels, of some of the unknowns under consideration 
here with other unknowns. Instead, the standard slopes (b_s) were 
compared, since the dose levels and the number of chicks per dose level 
of the standard were essentially uniform throughout each series. Obviously, 
b_s for any assay was not greatly different from b_c based on 
parallel response curves. The mean (b̄_s) and the observed variance 
(V(b_s)) of the standard slope were therefore calculated for each series 
and are shown in Table 1. The predicted variance of b_s calculated from 
the average intra-assay statistics for each series (Table 1) was only 
two-thirds to one-half of the observed variance, a significant 
underestimate in most cases. In addition to the heterogeneity of the slopes, 
significant heterogeneity of assay residual variances (s²), each with a 
large number of degrees of freedom, was found in each series by 
Bartlett’s test. 


DERIVATION OF M’S AND THEIR PREDICTED VARIANCES 


M for each unknown was calculated on the basis of the combined 
slope (b_c) computed from all parallel responses in an assay. The 
predicted variance V(M) was calculated by Bliss’s formula, with Bliss’s 
notation [3]: 


    V(M) = \frac{s^2}{b^2}\left\{\frac{1}{N_u} + \frac{1}{N_s} + \frac{(\bar{y}_u - \bar{y}_s)^2}{B^2}\right\}    (1) 


The statistic g = s²t²/B² was calculated for a random sample of 
assays from each series using all parallel slopes in each assay. In most 
cases it was under .05 (Table 1), which is negligible by the usual criteria. 
The calculation of exact fiducial limits, using g, would thus be com- 
parable to an expansion of V(M) by about 5% at the most. Since this 
small change would have little effect on the analysis, it was ignored. 
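The size of this effect can be checked with a rough calculation. The sketch below assumes that allowing for g amounts to dividing the slope term of V(M) by (1 − g), as in Bliss’s expression s²/(B² − s²t²), so that 1/(1 − g) − 1 is an upper bound on the proportional expansion; the function name `max_expansion` is hypothetical.

```python
def max_expansion(g):
    """Upper bound on the proportional increase in V(M) when the slope
    term is divided by (1 - g); assumes 0 <= g < 1."""
    if not 0 <= g < 1:
        raise ValueError("g must lie in [0, 1)")
    return 1.0 / (1.0 - g) - 1.0

# Median g values from Table 1, and the .05 level mentioned in the text.
for g in (0.01, 0.04, 0.05):
    print(f"g = {g:.2f}: at most {100 * max_expansion(g):.1f}% expansion")
```

On this reading a g of .05 corresponds to an expansion of a little over 5%, in line with the figure given in the text.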


Replicate M’s from Series I and II 


One hundred sixty-nine replicate M’s for 75 urine extracts were 
available from ten assays in Series I and 52 assays in Series II. M and 
its variance estimated as described will be denoted as M_c and V(M_c) 
for these replicates. In an attempt to assess the effect of the slope 
variability on the estimates, a second set of M’s was computed, based 
on the mean standard slope (b̄_s), as M_s = x̄_s − x̄_u + (ȳ_u − ȳ_s)/b̄_s. 
Not unexpectedly, the individual M_s’s differed considerably in some 
cases from their counterparts (M_c) based on assay slopes, but these 
differences were not systematic. A second set of variances V(M_s) 
was also computed, in which V(b_s), the observed standard slope variance, 
was included directly. The formula used was: 


    V(M_s) = \frac{1}{\bar{b}_s^2}\left\{ s^2\left(\frac{1}{N_u} + \frac{1}{N_s}\right) + \frac{(\bar{y}_u - \bar{y}_s)^2\, V(b_s)}{\bar{b}_s^2 - t_s^2 V(b_s)} \right\}    (2) 


where all symbols have the meanings previously assigned, and t_s is 
the tabular “t” value at P = .05 for V(b_s). The second term inside the 
curly brackets is equivalent to Bliss’s (ȳ_u − ȳ_s)²s²/(B² − s²t²) [8, 
p. 611]. 

The use of Equation 2 rests on two assumptions: that the mean 
difference (ȳ_u − ȳ_s) is subject to the assay variance, and that M is 
also affected by the slope by assay interaction. The second assumption 
would necessitate modification of the model used by Finney [26] in 
his proof that M is unaffected by the slope variance, which is discussed 
below. The calculation of exact fiducial limits on the basis of these 
assumptions would be tantamount to an inflation of V(M_s) by about 
40% for M’s from Series I and 26% for M’s from Series II. As will 
be apparent from the data presented below, the adjustment would 
lessen but not completely remove the discrepancies between the observed 
and predicted error of M. 
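The inflation figures quoted for Series I and II can be approximately recovered from Table 1. The sketch below is an illustration under assumptions that are not stated in the text: the inflation is taken as 1/(1 − g) with g = t²V(b_s)/b̄_s², and the degrees of freedom for V(b_s) are taken as the number of assays in each series less one (12 and 60), for which the tabular t values at P = .05 are 2.179 and 2.000.

```python
# Values from Table 1; the t values are standard two-sided 5% points for
# 12 and 60 degrees of freedom (an assumption about the d.f. used).
series = {
    "I":  {"mean_slope": 24.2, "slope_variance": 36.6, "t": 2.179},
    "II": {"mean_slope": 24.2, "slope_variance": 30.0, "t": 2.000},
}

def slope_inflation(mean_slope, slope_variance, t):
    """Proportional inflation 1/(1 - g) - 1, with g = t^2 V(b_s) / mean slope^2."""
    g = t ** 2 * slope_variance / mean_slope ** 2
    return 1.0 / (1.0 - g) - 1.0

inflation = {name: slope_inflation(**vals) for name, vals in series.items()}
for name, value in inflation.items():
    print(f"Series {name}: about {100 * value:.0f}% inflation")
```

Under these assumptions the calculation gives roughly 42% for Series I and 26% for Series II, close to the figures of about 40% and 26% given above.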


Replicate M’s from Assay Series IV and V 


From the assays in Series IV and V, replicate M’s were available for 
two categories of unknowns: 56 duplicate estimates for 28 urine extracts, 
and 54 replicate estimates for 23 synthetic steroids (Table 2). They 
were derived from 21 assays in Series IV and 13 in Series V, but extracts 
and steroids were tested in different assays in the two series. M and 
V(M) for these unknowns were computed with intra-assay statistics 
only. 


FIGURE 1 

DISTRIBUTION OF X² VALUES FOR ASSAY SLOPE M_c’S BY PROBABILITY (P) LEVEL 
(vertical axis: number of X² values observed; horizontal axis: P) 


COMPARISON OF OBSERVED AND PREDICTED ERROR 


The mean observed variance of the M’s was computed from the 
sum of the squared deviations of replicate M’s from their own mean. 
The mean observed and predicted variances and related statistics are 
compared in Table 3, for estimates from Series I and II, as computed 
both for M_c and for M_s. In both cases the difference between the 
observed and the predicted variance was of considerable magnitude 
and highly significant (P < .001), leading to the conclusion that, for 
these data, the predicted error was a gross underestimate of the actual 
error. 


TABLE 3 

VARIANCE OF 169 REPLICATED LOG POTENCY ESTIMATES (M) OF 75 UNKNOWNS 
FOR ANDROGEN ASSAYS IN SERIES I AND II, AS COMPUTED BY TWO METHODS 


                                                | Method of computation 
                                                | Assay slopes | Mean standard 
                                                | (b_c)        | slope (b̄_s) 
Mean observed variance of M                     | 0.0894       | 0.0769 
Mean predicted variance of M                    | 0.0281       | 0.0333 
F = Mean observed variance / 
    Mean predicted variance                     | 3.18         | 2.31 
P(F)                                            | < .001       | < .001 
Percent standard error of R 
   (antilog s.e._M − 1) 100 
      Observed                                  | 99           | 89 
      Predicted                                 | 47           | 52 
Number of unknowns heterogeneous (P < .05)      | 13           | 12 
Percent of unknowns heterogeneous (P < .05)     | 17.3         | 16.0 
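The percent standard errors in Table 3 follow from the variances of M by the expression given in the table, (antilog s.e. − 1) × 100, with logarithms taken to base 10. A brief check, with a hypothetical function name:

```python
import math

def percent_standard_error(variance_of_m):
    """Percent standard error of the potency ratio R from the variance of
    the log10 potency M: (antilog of the standard error, minus 1) times 100."""
    return (10 ** math.sqrt(variance_of_m) - 1) * 100

# Mean variances of M from Table 3.
table_3 = {
    "observed, assay slopes": 0.0894,
    "predicted, assay slopes": 0.0281,
    "observed, mean standard slope": 0.0769,
    "predicted, mean standard slope": 0.0333,
}
percent = {label: percent_standard_error(v) for label, v in table_3.items()}
for label, value in percent.items():
    print(f"{label}: {value:.0f}%")
```

Rounded to whole numbers these reproduce the tabulated values of 99, 47, 89 and 52.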


The observed and predicted variances were compared for each unknown 
individually by the X² test [3, 19]. Figure 1 shows the distribution 
of the probability values (P) corresponding to the values of X² 
obtained for the M_c’s. The corresponding distribution (not shown) 
for the M_s’s was similar, although the X²’s for individual unknowns 
were not identical for the two series of M’s. The discrepancies from 
the expected distribution of probability values were highly significant 
(P < .002) in both sets of X², and the P .05 level was exceeded in more 
than 16% of the unknowns. The total distributions were characterized 
by a relative scarcity of low values and a surplus of high values. 

Although the use of b̄_s and V(b_s) in the computations narrowed the 
gap between the predicted and observed variances of M, the differences 
from the intra-assay results were not statistically significant. There 
was no convincing evidence that the use of b̄_s and V(b_s) improved the 
accuracy of the predictions for individual unknowns. 
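The X² comparison for a single unknown can be sketched as below. This shows one common form of the heterogeneity test, in which each replicate M is weighted by the reciprocal of its predicted variance; it is not necessarily the exact computation used by the authors, and the replicate values and variances are invented. The probability is computed only for duplicates (one degree of freedom), where it can be obtained from the complementary error function.

```python
import math

def heterogeneity_chi2(estimates, predicted_variances):
    """Weighted sum of squared deviations of replicate M's from their
    weighted mean, with weights 1/V(M); returns (X2, degrees of freedom)."""
    weights = [1.0 / v for v in predicted_variances]
    mean = sum(w * m for w, m in zip(weights, estimates)) / sum(weights)
    chi2 = sum(w * (m - mean) ** 2 for w, m in zip(weights, estimates))
    return chi2, len(estimates) - 1

def p_value_one_df(chi2):
    """Upper-tail probability of X2 with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Invented duplicate estimates of M with equal predicted variances.
chi2, df = heterogeneity_chi2([1.20, 1.80], [0.03, 0.03])
p = p_value_one_df(chi2)
```

For these invented duplicates X² is 6.0 on one degree of freedom, which lies beyond the .05 level; an unknown giving such a result would be counted as heterogeneous.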

In Series IV and V, the observed variance for urine extracts was 
almost exactly as predicted (Table 4). For synthetic steroids the pre- 
dicted error of M was greater than for the urine extracts, mainly because 
the first M for a synthetic was usually based on a single dose. However, 
this relatively large predicted error significantly (P < .05) under- 
estimated the observed variance in a ratio of 1:1.6. 


TABLE 4 


VARIANCE OF 110 REPLICATE M’S FOR 51 UNKNOWNS FROM ANDROGEN ASSAYS 
IN SERIES IV AND V 


                                                | Urine     | Synthetic 
                                                | extracts  | steroids 
Number of unknowns                              | 28        | 23 
Number of M’s                                   | 56        | 54 
Mean observed variance of M                     | 0.0127    | 0.0277 
Mean predicted variance of M                    | 0.0121    | 0.0179 
F = Mean observed variance / 
    Mean predicted variance                     | 1.05      | 1.55 
P(F)                                            | > .05     | < .05 
Percent standard error of R 
   (antilog s.e._M − 1) 100 
      Observed                                  | 29.6      | 46.7 
      Predicted                                 | 28.8      | 36.1 
Number of unknowns heterogeneous (P < .05)      | 2         | 3 
Percent of unknowns heterogeneous (P < .05)     | 7.1       | 13.0 


DISCUSSION OF THE VARIATION IN REPLICATE M’S 


Although in the ethanol assays the predicted error of M was reduced, 
the marked improvement in inter-assay agreement could not have been 
anticipated from the intra-assay statistics (Table 1) or from comparisons 
among assays on a control chart. In the ethanol series, moreover, 
while the statistics provided accurate estimates of the precision of 
M for urine extracts, the predictions for synthetic steroids were inade- 
quate. The findings strongly suggest that experimental error associated 
with the administration of androgens in ether was the major factor 
behind the earlier results. However, the persistence of appreciable 
inter-assay error for steroid estimates indicates a need for continued 
investigation. 

Systematic change with time was eliminated as a predominant 
cause of discrepancies by the failure to find any consistent trends. Cage 
differences may have contributed to the error since unknowns were not 
replicated over cages within an assay. However, several experiments 
with both ether and ethanol as solvents for the reference standard 
showed no significant cage differences, nor did duplicate cage means 
from 24 assays in Series IV (Table 5). It therefore seems unlikely 
that cage differences were an important source of error. 


TABLE 5 


ANALYSIS OF VARIANCE OF CODED MEAN RESPONSES TO ANDROSTERONE 
FROM 24 ASSAYS IN SERIES IV 


In each assay, values for duplicate cage means of 11 chicks were assigned at random 
to “standard” or “unknown” group. 


Source                          | D.f. | M.s. 
Assays                          | 23   | … 
Regression on dose              | 1    | 22,482.7*** 
Between “substances”            | 24   | 8.8 
Assay × regression              | 23   | … 
“Substance” × regression        | 24   | 11.8 
Within cage                     | 970  | 8.5 

Mean M                          | 1.9995 
Observed variance of M          | 0.0137 
Predicted variance of M         | 0.0085-0.0339 


***P < .001 


Lack of similarity in comb response to unknowns and to the standard 
androsterone may be a factor in the production of inter-assay error. 
Analysis of the data from all assays that met the previously outlined 
criteria for validity, did not produce suggestive evidence that non- 
identity or heterogeneity played an important role in the excess variation. 
However, further study of the possibility seemed indicated. 

A related, more statistical issue considered was whether one could 
detect the presence of appreciable inter-assay error in M from direct 
assay analyses without such calculations on replicate estimates as were 
reported here. This would depend on identifying intra- or inter-assay 
variation associated with variation in M, to provide a basis for estimating 
its variance as a more reliable indication of the actual error. The use 
of interactions of assays with treatments, slope and other effects as the 
error term would seem to have been ruled out by Finney [26] in the 
case of biological assays. He showed that in replicated analytical 
dilution assays differences in response levels are perfectly correlated 
with slope so that the within-group s² is the only suitable basis for 
estimating the error of M. The applicability of Finney’s model to all 
biological assays has been questioned by Bliss [8, 27]. 


Experimental Trial 


Information on some of these questions was sought in an experiment 
involving three substances: the standard androsterone (S), a chemically 
distinct androgen, methyl testosterone (MT), and an extract (L25) 
of a urine specimen obtained from a clinically normal young man. 
Since the predominant androgen in normal urines is androsterone it was 
expected that the responses to L25 and to S would be similar. Four 
dose levels of each substance were administered and replicated over 
four assays. Significant inter-assay slope variation and possibly 
additional interactions were anticipated. The main purposes were: 


1) to see whether the three distinct substances would evoke different 
types of responses and whether MT would be different from the other 
two; 

2) to test the applicability of Finney’s model to this situation; 

3) to estimate the intra- and inter-assay error in the ethanol assay 
method for each of the three substances and for comparisons among 
them; and 

4) if appreciable inter-assay error were found, to seek statistically 
identifiable sources of variation associated with variation in M. 


At least the usual opportunities for unidentified experimental error 
existed. In addition a recognized error in dosage made it necessary to 
discard one response group in assay 147. A “missing value” was 
computed in the usual manner [3] from the data of that assay for the 
combined analysis (Table 6). No significant deviations from linearity 
were found in any assay. The slope of the response to MT was non- 
parallel (P < .05) to the other slopes in all assays except number 147. 
In the combined analysis (Table 7) the significant findings of interest 
were the difference in the level of the response to MT and the variable 
differences between the regression for MT and the other regressions. 


The variance observed in M for each unknown was in agreement with 
prediction (Table 8). 


TABLE 6 


EXPERIMENT ON INTER-ASSAY ERROR 
CODED RESPONSES* 


Substance                   | Daily dose    | Assay 146 | Assay 147 | Assay 148 | Assay 149 
Androsterone (S) 
   Dose: 1                  | 0.1 µg.       | 65.0      | 71.6**    | 74.5      | 80.6 
         2                  | 0.2 µg.       | 77.6      | 81.5      | 79.9      | 79.3 
         3                  | 0.4 µg.       | 87.6      | 90.2      | 92.1      | 88.8 
         4                  | 0.8 µg.       | 93.4      | 96.7      | 100.8     | 100.3 
   Mean                     |               | 80.9      | 85.0      | 86.8      | 87.2 
   Slope                    |               | 31.7      | 27.9      | 30.4      | 22.9 
Methyl Testosterone (MT) 
   Dose: 1                  | 0.07 µg.      | 65.7      | 68.8      | 66.7      | 67.5 
         2                  | 0.14 µg.      | …         | 84.6      | 79.3      | 81.3 
         3                  | 0.28 µg.      | 80.2      | 90.0      | 90.1      | 90.8 
         4                  | 0.56 µg.      | 86.2      | 98.8      | 105.8     | 101.2 
   Mean                     |               | …         | 85.6      | 85.5      | 85.2 
   Slope                    |               | 21.2      | 31.8      | 42.7      | 36.7 
Extract (L25) 
   Dose: 1                  | 0.000025 d.   | 70.1      | 75.8      | 76.6      | 75.8 
         2                  | 0.000050 d.   | …         | 81.8      | 82.3      | 80.6 
         3                  | 0.000100 d.   | 87.2      | 89.8      | 91.2      | 89.7 
         4                  | 0.000200 d.   | 102.5     | 95.8      | 103.0     | 100.1 
   Mean                     |               | 84.4      | 85.8      | 88.3      | 86.6 
   Slope                    |               | 35.0      | 22.7      | 29.4      | 27.3 
Mean slope                  |               | 29.5      | 27.5      | 34.1      | 29.0 
s² (per response mean of 
   11 chicks)               |               | …         | 9.2       | 9.5       | 10.4 


*Response metameter = 100 log (comb weight in mgm. / body weight in decigrams). 

**Missing value inserted. 
All L25 doses in fractions of a 24-hour urine specimen. 


The experiment provided gratifying evidence of the reduction in 
inter-assay error, but its potential usefulness in illuminating the problems 
was thereby reduced. The interpretation of the variable differences 
between MT and the other two substances is obscure. Although the 
non-parallelism would make MT formally non-assayable relative to 


142 BIOMETRICS, JUNE 1957 


TABLE 7


EXPERIMENT ON INTER-ASSAY ERROR
COMBINED ANALYSIS OF VARIANCE ON CODED MEAN RESPONSES


Source of variation                            D.F.    M.S.
Assays                                           3     [illegible]
Substances                                       2     [illegible]
  S vs. L25                                      1     12.8
  S + L25 vs. MT                                 1     [illegible]
Regression                                       1     [illegible]***
Substances × Regression                          2     12.9
  S vs. L25                                      1     0.2
  S + L25 vs. MT                                 1     128
Assays × Substances                              6     9.4
  Assays × S vs. L25                             3     6.2
  Assays × S + L25 vs. MT                        3     12.3
Assays × Regression                              3     11.3
Assays × Substances × Regression                 6     22.8
  Assays × S vs. L25 × Regression                3     4.6
  Assays × S + L25 vs. MT × Regression           3     [illegible]**
Deviations from regression                      23     6.6
Within groups                                  503     [illegible]

*P < .05
**P < .01
***P < .001

TABLE 8


EXPERIMENT ON INTER-ASSAY ERROR
M VALUES FOR THE TWO UNKNOWNS


                              Methyl          Extract
                              Testosterone    L25
Assay: 146                    1.041           [illegible]
       147                    1.170           1.626
       148                    [illegible]     1.646
       149                    1.084           [illegible]
Unweighted mean               1.103           1.6438
Mean observed variance        0.0030          0.0037
Mean predicted variance       0.0055          0.0054

Mean observed variance / Mean predicted variance: [ratio and significance level illegible]

androsterone, the agreement among replicate M values for MT was
comparable to L25. A useful and apparently reliable estimate of its
potency was therefore obtained despite the deviation from the analytical
dilution model.

Except in the case of MT there was no evidence of appreciable inter-
assay slope variance in the experiment, which therefore was not an
adequate test of the applicability of Finney’s model. However, the
little evidence it provided tended to confirm the argument that assay
interaction terms in the analysis of variance should not be the basis
for predicting the error of M.


METHOD FOR INCORPORATING INTER-ASSAY ERROR INTO 
THE ESTIMATED PRECISION OF M 


For estimates subject to inter-assay error, particularly the replicate 
M’s from assay Series I and II, the best method for incorporating this 
error into individual estimates of precision required consideration. 


Basic Hypothesis 


The variance of the $i$th $M$ for any unknown in these data
($i = 1, 2, \ldots$), derived from assay $a$, may be assumed to consist of
$\sigma^2_{ai} + \sigma^2_e$, where $\sigma^2_{ai}$ is the intra-assay variance for that $M_i$, and $\sigma^2_e$ is
the inter-assay experimental error. The intra-assay variances ($\sigma^2_{ai}$)
were heterogeneous not only because of factors such as the number of
dose levels administered, the number of chicks used, and the [symbol illegible] obtained,
but more fundamentally because of the inequality of the variance of
individual responses ($\sigma^2$) in different assays. This assumption would
place the M’s under consideration in the category of estimates with
heterogeneous intrinsic variances and with experimental error or inter-
action. Cochran [28] has shown that for such quantities, the weighted
mean ($\bar{M}_w$) is an unsuitable combined estimate, and the choice lies
between the unweighted mean ($\bar{M}$) and the generally more precise
semiweighted mean ($\bar{M}_{sw}$). Cochran and Bliss [3], whose discussions
were oriented mainly toward the combination of k estimates of one
quantity when the available evidence consists of the data underlying
these k estimates, have given formulas for the variances of $\bar{M}$ and $\bar{M}_{sw}$.
Bliss has recommended that $\bar{M}_w$ be discarded in favor of $\bar{M}_{sw}$ when a
$\chi^2$ test for heterogeneity is significant at the P = .05 level. However, the use of
such a rule in analogous situations has been shown [29, 30] to result in
underestimation of the variance in many cases, and more rigorous
criteria have been considered [28, 30].
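
The heterogeneity test referred to above can be illustrated with a minimal sketch. It assumes the common form of the approximate chi-square, namely the weighted sum of squared deviations of the estimates from their weighted mean, with weights equal to the reciprocals of the intra-assay variances. The function name and the numerical inputs are hypothetical and are not taken from the assays reported here.

```python
def weighted_mean_and_chi2(estimates, variances):
    """Return the weighted mean (weights = 1/variance), the approximate
    chi-square for heterogeneity, and its degrees of freedom (k - 1)."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    m_w = sum(w * m for w, m in zip(weights, estimates)) / total_w
    chi2 = sum(w * (m - m_w) ** 2 for w, m in zip(weights, estimates))
    return m_w, chi2, len(estimates) - 1


# Hypothetical replicate log-potency estimates with equal intra-assay variances.
m_w, chi2, df = weighted_mean_and_chi2([1.0, 1.2, 1.4], [0.01, 0.01, 0.01])
print(m_w, chi2, df)
```

Under the rule discussed in the text, the resulting chi-square would be referred to tables with k - 1 degrees of freedom; the criticism that follows concerns what is done after that comparison, not the arithmetic itself.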

The effect of this rule was examined on the replicated potency esti-
mates from assay Series I and II. The estimated variances of semi-
weighted means ($\bar{M}_{sw}$) for the 13 heterogeneous unknowns (Table 3)
would be up to twenty times the predicted variance for $\bar{M}_w$. According
to the same rule, $\bar{M}_w$ and its variance, including no correction for inter-
assay error, would be computed for the 62 unknowns (83%) with
non-heterogeneous potency estimates. If, as we have postulated, the
$M_i$’s for all 75 unknowns were actually samples from a distribution with
a variance $\sigma^2_{ai} + \sigma^2_e$, the result would be an over-estimate of the error
of apparently heterogeneous M’s and an underestimate of the error of
apparently homogeneous M’s. The chance variation of individual
M’s around their true value would result in the assignment of vastly
different degrees of precision to pooled estimates merely according to
the value of P arbitrarily set as the criterion for the computation of $\bar{M}_{sw}$.


Limitations and Consequences of Incorporation of the Average Estimate
($\hat{\sigma}^2_e$) of Inter-Assay Error.


An alternative procedure for incorporation of inter-assay error into
the estimated precision of M in a situation where replicates of multiple
unknowns are available would involve the use of an average estimate,
$\hat{\sigma}^2_e$, to be applied to all potency estimates. $\hat{\sigma}^2_e$ may be used both in the
estimation of the variance of an $M_i$ from a single assay, as $V(M_i) + \hat{\sigma}^2_e$,
and in the calculation of semiweighted means and their variances. In
our data, the mean observed variance of $M_i$, 0.0894, is equated to
$\hat{\sigma}^2_a + \hat{\sigma}^2_e$, and $\hat{\sigma}^2_a$, as an estimate of the mean $\sigma^2_{ai}$, is equated to the
mean $V(M_i)$, 0.0281. Then $\hat{\sigma}^2_e = 0.0894 - 0.0281 = 0.0613$. Thus,
in the present case, $\hat{\sigma}^2_e$ is more than twice the mean $V(M_i)$, and its use
would result, on the average, in the tripling of each estimate of variance.
The limited precision thus assigned to M is a fair reflection of the assay
error during the period when these estimates were obtained. The
expanded variances of the $M_i$’s would be employed in the usual manner
for weighting the $M_i$’s in the calculation of $\bar{M}_{sw}$ and $V(\bar{M}_{sw})$. For any
k estimates with n chicks per unknown in one assay, $V(\bar{M}_{sw})$ would be
approximately $(\sigma^2_{ai} + \sigma^2_e)/k$, where $\sigma^2_{ai}$ is a function of $1/n$. For any $nk$,
there is an obvious advantage in increasing $k$ as much as possible.
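
The arithmetic of this paragraph can be sketched as follows. The two variance figures (0.0894 and 0.0281) are those quoted in the text; the function names, the truncation of a negative difference at zero, and the replicate estimates in the example are assumptions made only for illustration. The semiweighted mean is taken here in its usual form, with each estimate weighted by the reciprocal of its intra-assay variance plus the inter-assay component.

```python
def inter_assay_variance(mean_observed_var, mean_predicted_var):
    """Average inter-assay component: observed minus predicted variance.
    Truncating at zero is an assumption added for this sketch."""
    return max(mean_observed_var - mean_predicted_var, 0.0)


def semiweighted_mean(estimates, intra_variances, sigma2_e):
    """Semiweighted mean and its variance, each estimate weighted by
    1 / (intra-assay variance + inter-assay variance)."""
    weights = [1.0 / (v + sigma2_e) for v in intra_variances]
    total_w = sum(weights)
    mean = sum(w * m for w, m in zip(weights, estimates)) / total_w
    return mean, 1.0 / total_w


# Figures quoted in the text.
sigma2_e = inter_assay_variance(0.0894, 0.0281)

# Two hypothetical replicate estimates, each with the mean predicted variance.
mean, var = semiweighted_mean([1.0, 1.4], [0.0281, 0.0281], sigma2_e)
print(round(sigma2_e, 4), round(mean, 4), round(var, 4))
```

With equal expanded variances the variance of the combined estimate is the expanded variance divided by k, which is the approximation stated in the text and shows why increasing k, rather than n, is the more effective way of gaining precision.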

The reliability of the estimate of $\hat{\sigma}^2_e$ from our data is of course limited.
The assays in the two series were [word illegible] unequally according to
the number of M’s from each one that met the criteria for inclusion in
this study. Moreover, both the predicted and observed variances for the
several M’s from any one assay were obviously correlated. In spite of
these reservations about the validity of a particular $\hat{\sigma}^2_e$ as a correction
factor for inter-assay error, it would in our opinion provide a more
accurate expression of the precision of M, whether replicated or not,
than other methods which have been proposed.


DATA FROM ASSAY OF PARATHYROID ACTIVITY


An analysis of replicate M’s from another parallel line assay method
can be added to the above data. Twenty-four valid assays for the
calcium mobilizing effect of parathyroid hormone [31] yielded 35 M
values for 17 unknowns.* The residual error variance ($s^2$) and the assay
slope ($b_c$) both showed considerable variation, and in a number of
assays the slope was determined with poor precision, as indicated by
the values for g (Table 9). The unknowns were extracts of parathyroid
glands prepared by different methods. V(M) calculated by Equation 1
showed considerable variation, but its mean value of 0.0352 compared
well with an observed variance of 0.0307. The approximate $\chi^2$ values for
each unknown individually were significant at the P = .05 level in two out of 17
instances (12%). As judged by these comparisons, the internal assay
statistics in this method provided fairly reliable predictions of the
observed error despite the variability of the assays.


TABLE 9


CHARACTERISTICS OF 35 REPLICATE M’s FOR 17 UNKNOWNS FROM 24
PARATHYROID ASSAYS


Mean assay slope ($b_c$)                4.32
Mean assay residual variance ($s^2$)    1.383
Range of g values                       0.06-0.76
Mean observed variance of M             0.0307
Mean predicted variance of M            0.0352


SUMMARY AND CONCLUSIONS 


Replicate estimates (M) of the log androgenic potency of two types
of unknowns, urine extracts and synthetic steroids, were made with a 
biological assay method of the parallel line type. During the first part 
of the study, in which the vehicle for administration of the androgenic 
substances was ether, the observed variance of 169 M’s for 75 urine 
extracts was approximately three times as large as the variance pre- 
dicted by the usual computations based exclusively on intra-assay error. 

Rather than divide the replicated M’s from these assays arbitrarily
into homogeneous and heterogeneous estimates and treat them
differently, it was considered preferable to treat all of them as samples
from a distribution with a variance of $\sigma^2_{ai} + \sigma^2_e$, where $\sigma^2_{ai}$ is the intra-
assay variance and $\sigma^2_e$ is the inter-assay experimental error. The
application of this concept to the computation of the variance of a single
M, or of pooled replicate M’s, was described.


*Tabulations of the replicate M’s are not included in this report, but mimeographed copies of
these data are available and will be supplied on request.

In the second part of the study, in which the vehicle for administra- 
tion of the androgens was changed from ether to ethanol, the observed 
variation between duplicate estimates for 28 urine extracts was in 
excellent agreement with the prediction. Thus a change in the assay 
procedure which permitted more accurate and reproducible administra- 
tion resulted in the elimination of inter-assay error for urine extracts. 
However, in the same series of assays, the observed error of 54 replicate 
estimates for 23 synthetic steroids was still significantly greater (by 
55%) than the predicted error. 

The disappearance of inter-assay error for urine extracts and its 
persistence in the assay of synthetic steroids could not have been de- 
tected by an examination of the intra-assay statistics alone; it was 
necessary to calculate the variation among replicate estimates. This 
conclusion was also borne out by results of another bioassay method, 
for parathyroid hormone activity, in which 35 replicate M’s for 17
unknowns were examined. In spite of a great variability with respect
to $s^2$ and slope and, in many cases, large values for g, the variance of
the replicated potency estimates was not significantly greater than the
prediction from intra-assay statistics. 

It is emphasized that although the total error of biological assays 
tends to approach the intra-assay error as a minimum, inter-assay error 
cannot safely be ignored until its absence has been demonstrated. 
Moreover, once the presence of a relatively large inter-assay error has 
been recognized and placed on a quantitative basis, it is possible to 
work more effectively for improvement of the true precision of a bio- 
logical assay method by evaluating the effect of changes in procedure 
on the error between assays as well as within assays.


INHERENTLY LOW PRECISION OF INFECTIVITY
TITRATIONS USING A QUANTAL RESPONSE


G. G. MEYNELL


Department of Bacteriology, Postgraduate Medical School of London, Ducane Road, 
London, W.12, England 


Infectivity* and pharmacological titrations appear at first sight 
to be very similar, not only in execution but also in interpretation. In 
both, the test preparation is serially diluted, a group of subjects is 
inoculated with each dilution and, in the case of quantal responses, to 
which this discussion is confined, the number of subjects responding to 
each dilution is recorded. The proportion of subjects responding is 
usually related to the logarithm of dose by a sigmoid curve which is 
roughly symmetrical. Finally, the potency of the preparation is often 
expressed in terms of the ED50, the dose causing 50% of subjects to
respond, the precision with which this is done being clearly governed 
by the ‘steepness’ of the dose-response curve. The purpose of this 
paper is to suggest that the interpretations of infectivity and phar- 
macological titrations differ fundamentally owing to the completely 
different mechanisms by which the response is produced in the two cases 
and that infective particles act in a manner which restricts the steepness 
of the dose-response curve and consequently restricts the precision 
with which the ED50 of an infective preparation can be estimated.
It will be shown that the greatest precision predicted for infectivity 
titrations is considerably less than that attainable in pharmacological 
titrations performed under the same conditions. 

The production of an infective response is usually accounted for by 
a hypothesis derived from pharmacology which postulates that there 
exists an Individual Effective Dose (I.E.D.) for each subject such
that a subject is certain to respond if it receives a dose equal to or greater 
than its I.E.D. The shape of the dose-response curve is considered to 
be governed solely by the distribution of I.E.D.s amongst the subjects 
(Finney [1952a], Wilson and Miles [1955]), assuming that errors in dosage 
can be ignored. This hypothesis offers a reasonable explanation for 
responses to particles, such as drug molecules, which are not self- 
reproducing. Such particles must cooperate to produce the response 


*‘Infective’ is applied here to any system in which the response follows multiplication of the
inoculated particles in the subject. Therefore it includes viable bacteria, viruses, tumour cells, protozoa,
etc., and excludes drug molecules.


since, so far as is known, no response can be elicited by only one, or
even a hundred, molecules. As even the smallest doses inoculated 
contain very large numbers of molecules, there will be negligible differ- 
ences between replicate doses due to sampling error and variability 
in response must be due chiefly to variation in resistance of the subjects. 
The situation is quite different for infective particles which can multiply 
after administration. First, there are a few systems in which the sub- 
jects are completely susceptible and the inoculation of a single particle 
invariably causes a response. Such subjects are exactly similar to 
tubes of nutrient medium in bacterial counts performed by the dilution 
method and the shape of the dose-response curve is determined by 
the random distribution of particles amongst doses. Therefore, S,
the proportion of subjects which does not respond to a mean dose of d
viable particles is equal to $e^{-d}$, the first term of the Poisson series.
This argument does not apply to the majority of infective systems for
most subjects are only partially susceptible and the ED50 contains a
considerable number of particles. Nevertheless, the hypothesis of the 
I.E.D. is not necessarily applicable to such systems for it is possible 
that the response results from the multiplication of only a small number, 
perhaps only one, of the many particles inoculated. This alternative 
hypothesis, apparently first stated by Halvorson [1935], has been referred 
to as the ‘hypothesis of independent action’ by Meynell and Stocker 
[1957]. It postulates that infective particles act completely independently 
after inoculation so that the fate of one does not affect the fate of another. 
Each viable particle then has a probability p (1 > p > 0) of multiplying
sufficiently to cause a response. If the subjects do not differ in resistance,
S is again given by the first term of the Poisson series, $e^{-pd}$. A plot
of S against log d gives a slightly asymmetrical sigmoid curve which 
closely resembles an integrated normal curve with standard deviation 
of 0.5 (Irwin [1942]). Thus, the hypothesis of independent action 
predicts that there will be considerable variability in response and a 
fairly flat dose-response curve, even if the subjects are all of the same 
resistance. The two hypotheses are incompatible, for the hypothesis 
of independent action implies that it is impossible to be completely 
certain that a dose of any size will cause a response since the outcome 
of challenge is assumed to be governed by chance. The hypothesis of 
independent action cannot yet be regarded as established and, as some 
workers may question the assumption of independence in systems where 
the particles are not of maximum virulence (p < 1), there follows a 
brief account of experimental evidence which suggests that this hy- 
pothesis is almost always valid. 
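
The relation between dose and response implied by the hypothesis of independent action can be written out as a short sketch. It takes the proportion of subjects not responding to a mean dose of d viable particles as exp(-pd), so that the ED50 is ln 2 / p; the function names and the value of p used in the example are hypothetical.

```python
import math


def proportion_not_responding(d, p=1.0):
    """S = exp(-p * d): chance that none of a Poisson-distributed number of
    particles (mean d), each effective with probability p, causes a response."""
    return math.exp(-p * d)


def ed50(p):
    """Dose at which half the subjects respond, from exp(-p * d) = 0.5."""
    return math.log(2.0) / p


p = 1e-3  # hypothetical probability that a single particle is effective
d50 = ed50(p)
print(d50, proportion_not_responding(d50, p))
```

For completely susceptible subjects p = 1 and the ED50 is ln 2, about 0.69 viable particle; for partially susceptible subjects the ED50 rises in inverse proportion to p while the shape of the curve against log d is unchanged.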


The first two experiments are based on the following argument:
if independence obtains, only one particle will multiply sufficiently to
produce a response in some of the subjects inoculated with many
particles. This becomes very likely when a dose < 1 ED50 is inoculated
(Meynell and Stocker [1957], Fig. 2). Hence:

(a) In the first experiment, subjects are inoculated with a suspension 
containing equal proportions of distinguishable, equally virulent, 
variants of the same pathogen. Any subject which responds is sampled 
to determine the composition of the population of particles it then 
contains. If the hypothesis of independent action is valid, most samples 
from subjects responding to doses < 1 ED50 should contain at least a
predominance of one variant whereas samples from subjects responding
to many ED50 should yield all the variants in the proportions present
in the inoculum. The authors who first used this test of the hypothesis 
inoculated mixtures of different species and therefore did not test for 
interaction due to species-specific effects. Thus, Kunkel [1934] smeared 
leaves with a suspension containing equal numbers of ED50 of tobacco
mosaic virus (T.M.V.) and aucuba mosaic virus (A.M.V.). Only a
single dose was used (< 1 ED50) and each subject (an isolated lesion
on the leaf surface in this system) contained only one or other of the
two viruses. The experiment was repeated by Lauffer and Price [1945]
who included doses > 1 ED50. These produced an unexpectedly high
proportion of lesions containing T.M.V. alone, presumably owing to
antagonism (interference) between T.M.V. and A.M.V. when both
were present in the same lesion. Liu and Henle [1953] inoculated eggs
with 1/32 to 32 ED50 of a mixture of influenza A and B viruses.
Influenza A increases more rapidly than influenza B in this range of dosage
with the result that eggs given 16 ED50 usually yielded a great excess
of influenza A after incubation. Nevertheless some eggs receiving doses
< 8 ED50 yielded a great preponderance of influenza B, which strongly
suggests that the fates of the inoculated particles were determined 
independently. A mixture of variants of the same species was first used 
by Zelle, Lincoln and Young [1946] who, using only one size of dose, 
exposed guinea-pigs to clouds containing four variants of Bacillus 
anthracis. Most of the fatally infected animals yielded one or other of 
two of the variants (which were presumably of higher virulence than 
the rest) but the results are difficult to interpret as neither the value of 
p (Druett, Henderson, Packman and Peacock [1953]) nor the mortality is 
known. Meynell and Stocker [1957] infected mice by intraperitoneal 
injection with a mixture of variants of either Salmonella paratyphi
B (p = 6.7 × 10^[exponent illegible]) or Salmonella typhimurium (p ~ 10^[exponent illegible]), the doses
being in the range 0.2 to 10^[exponent illegible] ED50. As predicted, the proportion of
the variants recovered varied greatly from mouse to mouse when the
dose was 1 ED50 or less, and became steadily more uniform (and similar
to the inoculum) as the size of the dose increased. Nevertheless, far 
fewer mice than expected yielded only one variant. This discrepancy 
is attributed, on independent evidence, to a breakdown in resistance, 
caused by the outgrowth of the ‘effective’ fraction of the inoculum, 
which enabled members of the ‘ineffective’ fraction to multiply and to 
appear in the sample. Mice were also infected by mouth with two 
variants of Salmonella typhimurium (Meynell [1957]; p ~ 5 × 10^[exponent illegible])
which were of equal virulence but unequal in their rates of increase
in the subjects. Therefore, just as in the experiments of Liu and Henle
[1953], nearly all mice inoculated with many ED50 yielded only one
of the variants whereas those given 1 ED50 or less usually yielded
either variant alone. The inoculation of a mixture of species or of 
variants of a single species has therefore always given results which are 
explicable on the hypothesis of independent action although the results 
may be disturbed by interaction (either antagonism or synergism) in 
the later stages of infection. 

(b) In the second experiment, the relation of dose to the latent period 
intervening between inoculation and response is determined. In most 
systems, decrease in dosage prolongs the latent period (see, for example, 
Beard, Sharp and Eckert [1955]). However, if most responses to doses
< 1 ED50 are each due to the multiplication of a single particle, the
latent period should tend to become constant for doses < 1 ED50.
This has been observed with mice given Salmonella typhimurium by
intraperitoneal injection (p = 10^[exponent illegible] to 10^[exponent illegible]; Meynell and McCloy, in
preparation). Dr. Bryan has kindly pointed out that plots of latent
period against dose for several titrations of the Rous sarcoma virus
(Bryan, Calnan and Moloney [1955]; Bryan [1956]) show apparently
aberrant points for doses < 1 ED50 which are compatible with this
prediction. 

In the third experiment, the effect produced by inoculation of a 
given number of particles in one dose is compared with the effect produced 
by the same total number of particles divided amongst doses which are 
inoculated either simultaneously at different sites or at different times 
by the same route. The hypothesis of independent action predicts 
that the proportion of subjects responding to a total dose of d par- 
ticles will be the same whether or not the dose is divided. The predicted 
result has been obtained on the two occasions on which the prediction 
has been tested (Hewitt [1953], Goldberg, Watkins, Dolmatz and 
Schlamm [1954]). 
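
The prediction tested in the third experiment follows directly from the exponential form of S, as the following sketch shows. It assumes that each divided dose acts independently with S = exp(-pd); the function name, the dose sizes and the value of p are hypothetical.

```python
import math


def proportion_responding(doses, p):
    """Chance that at least one of several independent doses causes a response,
    taking the chance of no response to a dose d as exp(-p * d)."""
    none_respond = 1.0
    for d in doses:
        none_respond *= math.exp(-p * d)
    return 1.0 - none_respond


p = 1e-3  # hypothetical probability that a single particle is effective
single = proportion_responding([1000.0], p)
divided = proportion_responding([250.0] * 4, p)
print(single, divided)
```

Because the exponents add, the product over the divided doses equals exp(-p times the total dose), so the two proportions agree whatever the manner of division.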

Lastly, independence is a reasonable assumption to consider if it is 
borne in mind that the LD50 of a suspension of killed microorganisms
is far greater than the LD50 of live organisms. For example, the
LD50 of dead Gram-negative bacteria (Salmonella, Escherichia) con-
tains c. 10^[exponent illegible] organisms while the LD50 of living bacteria is usually less
than 10^[exponent illegible]. Also, Maaløe [1948] and Rowley [1954] have shown that if
the LD50 of an attenuated organism is measured with and without the
addition of various numbers of killed bacteria of the same species, the
size of the LD50 is unaffected unless more than 10^[exponent illegible] killed bacteria are
included in the inoculum. 

All the above points strongly suggest that the hypothesis of inde- 
pendent action is a useful model for most infections although interaction 
is known to occur in a few systems (Schneider and Zinder [1956], Gled- 
hill [1956]). 

The shape of the dose-response curve offers another means by which 
the validity of the hypothesis of independent action can be tested. 
As mentioned above, $S = e^{-pd}$, if the subjects are uniform in resistance.
The dose-response curve will be flatter if the subjects differ in resistance 
(Armitage and Spicer [1956]) but its shape will not be altered by hetero- 
geneity in the virulence of the particles (Fazekas de St. Groth and Moran 
[1955]). Hence, the observed relationship between dose and response 
can be compared with that predicted for uniform hosts, first, to see 
if the relationships could be the same (Druett [1952]) and, second, to 
see if the observed dose-response curve is significantly steeper than the 
predicted curve, a finding that would immediately show that the 
hypothesis of independent action was invalid. The value of p can be 
estimated from the data either by the methods given by Haldane 
[1939] and by Peto [1953], or by methods devised for bacterial counts 
by the dilution method (Finney, 1952b, §21.5). The observations can 
then be compared with the predictions in two ways: either by a $\chi^2$
test (Haldane [1939]) or by the rapid test introduced by Moran [1954a, b]
to reveal discrepancies due to flattening of the observed curve. The
latter test yields a quantity, M, a normal deviate, so that a value of
M > 1.645 indicates a significant (P < 0.05) departure from expectation.
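
The critical value quoted for the normal deviate can be checked with a short sketch using the standard normal distribution from the Python standard library. It covers only the conversion between a one-sided probability and a normal deviate; the calculation of Moran's quantity itself is not reproduced, and the function name is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def one_sided_probability(deviate):
    """Upper-tail probability of a standard normal deviate."""
    return 1.0 - standard_normal.cdf(deviate)


critical = standard_normal.inv_cdf(0.95)  # deviate exceeded with probability 0.05
print(round(critical, 3), one_sided_probability(1.645) < 0.05)
```

The test is one-sided because only flattening of the observed curve relative to the predicted curve is being looked for.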

Table 1 gives details of infective systems in which the observed and 
predicted curves have been compared by at least one of these methods. 
Graphical comparisons can be found in Youden, Beale and Guthrie 
[1935], Bald [1937, 1950], Sprunt [1941], Parker, Bronson and Green 
[1941], Lauffer and Price [1945], Bang [1948], Kleckowski [1950], Fazekas 
de St. Groth and Cairns [1952], Goldberg et al. [1954], Beard, Sharp and 
Eckert [1955] and Eckert, Beard and Beard [1956]: all these curves 
appear compatible with or flatter than the predicted curve. The 
values of p given in the Tables indicate the maximum known values for 
each system. The values for viruses and tumours are only tentative
as it is usually technically impossible to obtain an absolute estimate of d,
the mean number of potentially infective (i.e. viable) particles inoc-
ulated (Hoskins, Meynell and Sanders [1956]; Isaacs [1957]). Relative
estimates of p can be obtained in two ways. A suspension of particles
can be titrated in two subjects of differing resistance, a method which 
yields a maximum value, as p may be less than unity for the more 
susceptible subject. Or, occasionally, the total number of particles is 
known either from direct counting or from chemical data: this provides 
a minimum value as an unknown fraction of the particles may be 
nonviable. 

All save six of the curves summarised in Table 1 are compatible 
with or are flatter than the Poissonian curve, suggesting that the hypothe- 
sis of independent action is applicable to these systems. Five of the 
exceptional curves are from Table IV of Parker, Bronson and Green 
[1941]. The agent was an attenuated strain of vaccinia virus, producing 
very indistinct lesions, which was titrated separately in each of six 
rabbits. The authors considered that more than one lesion had to be 
present at each inoculation site to produce a visible response and the 
five exceptional curves are compatible with the curve predicted on the 
assumption that a visible response would be produced by ten or more 
lesions. The curve for the sixth rabbit was compatible with the pre-
dicted relationship $S = e^{-pd}$. The sixth exceptionally steep curve was
reported by Nadel, Fryer and Eisenstarck [1957] who inoculated chicks 
with Newcastle Disease virus. This titration has apparently been 
performed only once and it would be of considerable interest to repeat 
it to establish that the incompatibility with prediction did not arise 
solely from sampling error which would be expected to cause an occa- 
sional discrepancy. 

There are also many titrations reported which have not been com- 
pared with the Poissonian curve but have instead been analysed by 
probit methods. These can be compared with the predicted curve in 
a less exact manner. If $S = e^{-pd}$, as predicted for uniform subjects,
probit S plotted against log d gives a slightly concave curve with
slope, b, of 2.0003 at the ED50 point (Peto [1953]). This curve will
approximate to a straight line with the same slope if the points are
more or less symmetrically weighted around the ED50. Table 2 gives
the slopes and 95% confidence limits for a number of infective systems. 
Only the titration with the greatest slope is given when more than one 
titration is reported. It will be seen that none of the slopes is signifi- 
cantly greater than 2, the approximate maximum value predicted by 
the hypothesis of independent action. Thus, the observations recorded 
in the Tables support the general validity of this hypothesis. 
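
The figure of 2.0003 for the slope at the ED50 can be checked numerically. The sketch below assumes S = exp(-pd), works with the proportion responding (1 - S) so that the slope is positive, and takes logarithms of dose to base 10; the function name and the step length are hypothetical choices. An analytic value is included for comparison.

```python
import math
from statistics import NormalDist

standard_normal = NormalDist()


def probit_slope(log10_d, p=1.0, h=1e-5):
    """Central-difference slope of the normal equivalent deviate of the
    proportion responding, 1 - exp(-p * d), against log10 of dose."""
    def deviate(x):
        d = 10.0 ** x
        return standard_normal.inv_cdf(1.0 - math.exp(-p * d))
    return (deviate(log10_d + h) - deviate(log10_d - h)) / (2.0 * h)


ed50 = math.log(2.0)  # ED50 for p = 1
numerical = probit_slope(math.log10(ed50))

# Analytic slope at the ED50: ln(10) * ln(2) * 0.5 divided by the normal
# ordinate at zero deviate.
analytic = math.log(10.0) * math.log(2.0) * 0.5 / standard_normal.pdf(0.0)
print(round(numerical, 4), round(analytic, 4))
```

The slope does not depend on p, since changing p merely shifts the curve along the log dose axis; this is why a single limiting value of about 2 applies to all the systems in Table 2.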


TABLE 2


DOSE-RESPONSE CURVES FROM INFECTIVITY TITRATIONS:
SLOPES OBTAINED BY PROBIT ANALYSIS


                                                           No. of
Agent                          Subject          Route    titrations    p
VIRUSES
Eastern equine encephalo-      Chick embryo     A        49            -
  myelitis virus
Influenza virus                Tissue culture   -        3 pooled      [illegible]
Influenza virus                Chick embryo     A        2 pooled      >0.1
Avian erythromyeloblastic      Chick            IV       3             [illegible]
  leukosis virus
Rous sarcoma virus             Chicken          SC       1             >0.02
Rous sarcoma virus             Chick            SC       2             >0.02
BACTERIA
Br. suis                       Guinea pig       R        4             -
Br. suis                       Guinea pig       R        7             -
Br. melitensis                 Guinea pig       R        2             -
B. anthracis                   Guinea pig       R        5             -
B. anthracis                   Monkey           R        2             -
Salm. typhimurium              Mouse            IP       2 pooled      ~10^-3
Salm. typhi (+ mucin)          Mouse            IP       1             ~[illegible] × 10^-4
H. influenzae                  Mouse            IC       >40           ~10^-3


TABLE 2 (continued)

                               b (95% confidence
Agent                          limits)              n          log10 f    Author
VIRUSES
Eastern equine encephalo-      2.02 (1.02-3.02)     9-11       0.5        Crawley [1948]
  myelitis virus
Influenza virus                1.74 (1.26-2.22)     55         1.0-0.7    Fulton and Armitage [1951]
Influenza virus                1.42 (0.97-1.86)     27         1.0-0.3    Fulton and Armitage [1951]
Avian erythromyeloblastic      0.379 (0.297-0.461)  115-119    0.7        Eckert, Beard and Beard [1951];
  leukosis virus                                                          p from Isaacs [1957]
Rous sarcoma virus             1.2                  6-10       120        Bryan, Calnan and Moloney [1951];
                                                                          b from Fig. 4; p from Epstein [1956]
Rous sarcoma virus             1.81                 20         1.0        Bryan [1956]
BACTERIA
Br. suis                       2.86 (1.67-4.05)     6-20       var.       Elberg and Henderson [1948]
Br. suis                       2.38 (1.43-3.34)     20-50      var.       Druett, Henderson and Peacock [1956]
Br. melitensis                 1.76 (1.00-2.51)     20         <0.6       Elberg and Henderson [1948]
B. anthracis                   2.54 (1.69-3.39)     20-40      var.       Druett, Henderson, Packman
                                                                          and Peacock [1953]
B. anthracis                   3.19 (1.5-4.88)      7-8        var.       Druett, Henderson, Packman
                                                                          and Peacock [1953]
Salm. typhimurium              0.66 (0.34-0.98)     12         0.48       Meynell and Stocker [1957]
Salm. typhi (+ mucin)          0.66                 10         0.3-0.7    Batson [1949]
H. influenzae                  1.46 (1.28-1.64)     15         0.6-0.7    Irwin and Standfast [1957], Table 23

The symbols are those used in Table 1.


CONCLUSIONS


1. The dose-response curve for infectivity titrations will always be
relatively flat, even if subjects of uniform resistance could be used,
and the slope obtained by probit analysis is unlikely to exceed 2.
Infectivity titrations can therefore never be of the same precision, other
things being equal, as pharmacological titrations, many of which have
slopes in the range 5-20 (Gaddum [1933], Bliss and Cattell [1943]).

2. The precision of infectivity titrations will not be greatly increased
by the use of subjects specially bred for uniformity as variability in
response does not result principally from heterogeneity in the resistance
of the subjects. Moderate heterogeneity in resistance causes only a
slight distortion of the curve predicted for uniform subjects (Armitage
and Spicer [1956]), which is presumably why many observed curves are
compatible with the Poissonian curve.

3. If the observed and Poissonian curves are compatible, the con- 
centration of infective particles can be rapidly estimated from the data 
by means of tables published for use with bacterial counts by the 
dilution method (Finney [1952b]). 


SUMMARY 


1. Experimental evidence suggests that infective (i.e. self-reproduc-
ing) particles act independently after inoculation, and do not cooperate
as postulated by the hypothesis of the Individual Effective Dose. 

2. Consequently, the dose-response curve for either completely 
susceptible subjects, or partially susceptible subjects of identical resist- 
ance, will be derived from the first term of the Poisson series when a 
quantal response is observed. 

3. Considerable variability in response will therefore always be 
present in an infectivity titration using a quantal response, even if the 
subjects are of the same resistance, and this will be considerably greater 
than that observed in many pharmacological titrations. The observed 
variability will be increased if the subjects differ in resistance. 

4. A search of the literature has shown that nearly all reported 
infectivity titrations yield dose-response curves which are compatible 
with these predictions. 


I wish to thank Dr. P. Armitage, Dr. C. C. Spicer and Dr. B. A. D.
Stocker for many valuable suggestions.


MULTIPLE RANGE TESTS FOR CORRELATED
AND HETEROSCEDASTIC MEANS* 


David B. Duncan
Universities of Florida and North Carolina, U.S.A. 


1. INTRODUCTION 


Multiple range tests have been developed by several writers, for 
example, D. Newman [8], M. Keuls [5], J. W. Tukey [10] and D. B. 
Duncan [3], for testing differences between several treatment means
in cases in which all such differences are of equal a priori interest. 
These tests, which are also described in recent textbooks, for example, 
W. T. Federer [4, chapter 2], have been worked out for data in which 
the treatment means are homoscedastic (have equal variances) and 
are uncorrelated. Recently, C. Y. Kramer [6] has presented a simple 
method for extending these procedures to give useful tests for differences 
between means with unequal replications, the method being applicable 
to any set of heteroscedastic uncorrelated means. In a subsequent 
paper [7], the same author has given further extensions to tests of 
means which are also correlated, such as the adjusted means from 
analyses of covariance or from incomplete block designs. Similar 
work has also been done by E. Bleicher [1] and P. G. Sanders [9] in 
extending a multiple F test to making tests in lattice and rectangular
lattice designs. 

One purpose of this paper is to present a more complete method for 
these extensions which necessarily sacrifices a little in simplicity but is 
more powerful, especially in cases in which the differences between the 
means have appreciably different variances. Another purpose is to 
indicate briefly the closeness of the properties of these complete tests
of heteroscedastic and correlated means to those of the corresponding 
tests of homoscedastic and uncorrelated means. Incidental to these 
main purposes, a short-cut skipping principle, useful in applying multiple 
range tests to a large number of treatment means (or totals), is also 
presented. 


*Research jointly supported by the Florida Agricultural Experiment Station, by the U. S. Public


2. BASIC RULE FOR COMPLETE TESTS


Let $m_1, m_2, \ldots, m_n$ represent $n$ normally distributed means such
that the variance of the difference between each pair can be written
$V(m_i - m_j) = k_{ij}\sigma^2$, where $k_{ij}$ is known and $\sigma^2$ is an expected error mean
square. Let $s^2$ with $n_2$ degrees of freedom be the usual type of analysis
of variance error mean square estimate for $\sigma^2$. In other words, $n_2 s^2/\sigma^2$
is distributed as $\chi^2_{n_2}$ and is independent of the means $m_1, m_2, \ldots, m_n$.

Call $a_{ij} = \sqrt{2/k_{ij}}$ the adjustment factor for, and $(m_i - m_j)' =
a_{ij}(m_i - m_j)$ the adjusted difference between, the means $m_i$ and $m_j$,
and call $R'_p = s z_p$ the critical value for $p$ means, where $z_p$ is the
Studentized significant range for $p$ means for $n_2$ degrees of freedom and
for an $\alpha$-level test.

The proposed complete basic rule for an a-level multiple range test 
may then be expressed as follows: ANY SUBSET OF $p$ MEANS IS
HOMOGENEOUS IF THE LARGEST ADJUSTED DIFFERENCE IN THE SUBSET FAILS
TO EXCEED THE CRITICAL VALUE $R'_p$. ANY TWO MEANS NOT BOTH
CONTAINED IN THE SAME HOMOGENEOUS SUBSET ARE SIGNIFICANTLY
DIFFERENT. ANY TWO MEANS BOTH CONTAINED IN THE SAME
HOMOGENEOUS SUBSET ARE NOT SIGNIFICANTLY DIFFERENT.

This is the same as the basic rule implicitly adopted by Kramer [6] 
except that in the latter a subset of p means is declared homogeneous 
if its adjusted range does not exceed $R'_p$. If an adjusted difference
within a subset exceeds the adjusted range, as it may do through 
having a smaller variance and hence a larger adjustment factor, it 
will be significant by the complete rule and this may also result in the 
detection of further significant differences. 


3. NUMERICAL EXAMPLE I: TEST OF 
UNEQUALLY-REPLICATED MEANS 


Table 1 illustrates a convenient method for applying the complete 
rule. The example consists of the application of a 5 per cent level 
new multiple range test [3] to a set of seven unequally-replicated treat- 
ment means from a completely randomized design. A similar extension 
of any multiple range test, e.g., [5], [8], and [10], could be made by the 
same method, the only difference being in the source used for the 
Studentized significant ranges $z_p$ in section (b). Table 2 gives details
of the computation of the adjusted differences used in Table 1. 

The initial preparation of the data is the same as for Kramer’s 
method [6]. Table 1, section (a) shows the analysis of variance con- 
cluding with the calculation of the error standard deviation s = 73.45. 
Section (b) shows the computation of the critical values $R'_p = s z_p$,
the Studentized significant ranges $z_p$ having been taken from [3, Table II]


TABLE 1

5 PER CENT LEVEL NEW MULTIPLE RANGE TEST [3] OF SEVEN UNEQUALLY-
REPLICATED MEANS

a) Analysis of Variance

Source                d.f.    m.s.       s = √m.s.
Between treatments      6
Error                  16    5,394.6     73.45

b) Critical Values: $R'_p = s z_p$

p:         (2)     (3)     (4)     (5)     (6)     (7)
$z_p$:     3.00    3.15    3.23    3.30    3.34    3.37
$R'_p$:   220.4   231.4   237.2   242.4   245.3   247.5

c) Ranked Treatment Means and Replication Numbers

  D      F      A      B      C      E      G
 680    734    743    851    873    902    945
 (3)    (2)    (5)    (5)    (3)    (2)    (3)

d) Test Sequences

Seq.  Steps                                                                       Result
1.    $(G - D)' > R'_7$, $(G - F)' > R'_6$, $(G - A)' > R'_5$, $(G - B)' \not> R'_4$.      (BCEG)
2.    $(E - D)' > R'_6$, $(E - F)' \not> R'_5$. FABCE: $(E - A)' > R'_5$.
      FBCE: $(E - F)' > R'_4$. BCE.                                                -
3.    $(C - D)' > R'_5$, $(C - F)' \not> R'_4$. FABC: $(C - A)' > R'_4$.
      FBC: $(C - F)' \not> R'_3$, $(C - B)' \not> R'_3$, $(B - F)' \not> R'_3$.          (FBC)
4.    $(B - D)' > R'_4$, $(B - F)' \not> R'_3$. FAB: $(B - A)' > R'_3$. FB.          -
5.    $(A - D)' \not> R'_3$.                                                        (DFA)

e) Final Result

(DFA) (FBC) (BCEG)

Any two means not appearing together within the same parentheses are
significantly different. Any two means appearing together within the same
parentheses are not significantly different.


for a 5 per cent level test entering at the row for $n_2 = 16$ degrees of
freedom. Section (c) shows the treatment means ranked in ascending
order together with their respective replication numbers in parentheses. 
In any test of uncorrelated means it is helpful to list under the ranked 
means measures, such as replication numbers in this case, which provide 
a quick method of visually assessing the relative magnitudes of the 
variances of the means and hence of the variances of the differences 
between them. 

The main part of the test is in the sequences of steps in section (d). 
Each step consists of an application of the basic rule to a particular
subset. Sequence 1 consists of steps involving all subsets in which
the top mean G is the largest mean, sequence 2 involves all subsets in
which the second mean E is the largest mean, and in general, sequence
$i$ involves all subsets in which the $i$-th mean is the largest mean.

The order of steps in each sequence is the same at the beginning as 
that of previous procedures [3] and [6]. The steps in sequence 1, for 
example, consist of testing the adjusted ranges first of the whole set 
DFABCEG, then of the subset FABCEG, then of ABCEG and so on. 
At each step the lowest mean is dropped to give the subset for the 
next test. 


TABLE 2

ARITHMETICAL DETAILS FOR CALCULATING ADJUSTED DIFFERENCES IN TABLE 1,
SECTION (d)

$a_{ij} = \sqrt{2/k_{ij}} = \sqrt{2/(1/r_i + 1/r_j)} = \sqrt{2 r_i r_j/(r_i + r_j)}$,

where $r_1, r_2, \ldots, r_7$ are the replication numbers for the respective means, thus
$(G - D)' = (G - D)a_{GD} = 265\sqrt{2(3)(3)/6} = 265(1.732) = 459.0$.

Similarly

(G - F)' = 211(1.549) = 326.8,    (G - A)' = 202(1.936) = 391.1,
(G - B)' = 94(1.936) = 182.0,     (E - D)' = 222(1.549) = 343.9,
(E - F)' = 168(1.414) = 237.6,    (E - A)' = 159(1.690) = 268.7,
(C - D)' = 193(1.732) = 334.3,    (C - F)' = 139(1.549) = 215.3,
(C - A)' = 130(1.936) = 251.7,    (C - B)' = 22(1.936) = 42.6,
(B - F)' = 117(1.690) = 197.7,    (B - D)' = 171(1.936) = 331.1,
(B - A)' = 108(2.236) = 241.5,    (A - D)' = 63(1.936) = 122.0.
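The arithmetic of Table 2 can be checked with a few lines of Python. This is only a sketch: the dictionary and function names are illustrative and do not come from the paper, while the means and replication numbers are those of Table 1, section (c).

```python
import math

# Ranked treatment means and replication numbers from Table 1, section (c).
means = {"D": 680, "F": 734, "A": 743, "B": 851, "C": 873, "E": 902, "G": 945}
reps = {"D": 3, "F": 2, "A": 5, "B": 5, "C": 3, "E": 2, "G": 3}


def adjustment_factor(r_i, r_j):
    """a_ij = sqrt(2 r_i r_j / (r_i + r_j)) for uncorrelated means."""
    return math.sqrt(2.0 * r_i * r_j / (r_i + r_j))


def adjusted_difference(high, low):
    """(m_high - m_low)' = a_ij (m_high - m_low)."""
    factor = adjustment_factor(reps[high], reps[low])
    return factor * (means[high] - means[low])


# Print a few of the Table 2 entries to one decimal place.
for high, low in [("G", "D"), ("E", "F"), ("B", "D"), ("B", "A")]:
    print(f"({high} - {low})' = {adjusted_difference(high, low):.1f}")
```

Equal replication numbers r give the factor sqrt(r), so the adjusted difference is then the ordinary difference scaled by a constant and the procedure reduces to the usual equal-replication test.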


The changes in the complete test come in each sequence when the
adjusted range of a subset of $p$ means fails to exceed $R'_p$. If the
adjustment factor for the range is smaller than that of any other difference
in the subset, or, in a test like this of unequally-replicated means, if
either of the extreme means has fewer replications than any of the other
means, any of the other differences with a larger adjustment factor
should also be tested. In such cases it is helpful to write down the subset
concerned as a reminder that it will still be the subset under test until
an adjusted difference within it is found to exceed $R'_p$. For example,
when the adjusted range $(C - F)'$ of FABC fails to exceed $R'_4$ in step 2
of sequence 3, the subset FABC is written down before testing the
adjusted difference $(C - A)'$ in the next step. This serves as a reminder
that $(C - A)'$ must be compared with $R'_4$ and not $R'_3$ as would otherwise
have happened. Similarly in the next three steps, the preliminary
recording of FBC serves as a reminder that $(C - F)'$, $(C - B)'$ and
$(B - F)'$ each have to be compared with $R'_3$.

When an adjusted difference between the top mean in a subset of
$p$ means and an internal mean is found to exceed $R'_p$, the internal mean
is dropped to give the next subset to be tested. For example, in step 3
of sequence 3 FABC is reduced to FBC in this way by dropping A
when $(C - A)'$ exceeds $R'_4$.

When an adjusted difference, not involving the top mean, is found
to exceed $R'_p$, two subsets may qualify for further testing in the same
sequence. For example, if four means were ranked and had replication
numbers as follows


  P      S      R      Q
 (1)    (50)   (100)  (1)


the testing steps could be

$(Q - P)' \not> R'_4$. PSRQ: $(Q - S)' \not> R'_4$, $(Q - R)' \not> R'_4$,
$(R - P)' \not> R'_4$, $(R - S)' > R'_4$,


and the subsets PRQ and PSQ would both qualify for testing in further
steps in the same sequence.

The testing for a subset terminates either when it is shown to be 
homogeneous, which fact is recorded by noting the subset in parentheses 
in the result column at the end of the sequence involved, or when the 
subset is found to be completely included within another subset already 
shown to be homogeneous. For example, BCEG is recorded (BCEG) 
at the end of the first sequence to denote its homogeneity. This follows 
from the fact that the adjusted range $(G - B)'$ of BCEG does not
exceed $R'_4$ and neither of the replications 5 and 3 of B and G is less than
the replications 3 or 2 of C or E. The result (DFA) of sequence 5 is
of a similar form. In other cases, e.g. the result (FBC) in sequence 3,
it is sometimes necessary to test each adjusted difference in the subset
before it can be declared homogeneous. 

Sequences 2 and 4 provide no additional homogeneous subsets 
because they terminate at subsets BCE and FB, which are included in 
(BCEG) and (FBC) respectively already shown to be homogeneous. 

It should be noted in conclusion that it is possible for more than one
homogeneous subset to be found in a single sequence. For example, 
in the case of the means PSRQ discussed above the sequence involved 
could terminate with the results (PRQ) and (PSQ) or even (PQ) (SQ) 
and (RQ) depending on the other data involved. 

Section (e) of Table 1 shows a useful way of presenting the results of
the test. The device of presenting the whole set with homogeneous
subsets underscored as is done in [3] and [6] cannot be used here because 
of the differences in the order of the means in the various homogeneous 
subsets. For example, A is to the right of F in (DFA) but not in (FBC). 
The new method of putting homogeneous groups in parentheses can 
also be used in tests of equally replicated means and may be preferred 
for printing purposes. 
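As a cross-check on the example, the basic rule of section 2 can also be applied by brute force, examining every subset rather than following the sequences of Table 1. The sketch below is an illustration under that assumption and is not the step procedure described above; the function names are hypothetical, and the data, $s$ and $z_p$ values are taken from Table 1.

```python
import math
from itertools import combinations

# Data from Table 1: means, replication numbers, s and the z_p values.
means = {"D": 680, "F": 734, "A": 743, "B": 851, "C": 873, "E": 902, "G": 945}
reps = {"D": 3, "F": 2, "A": 5, "B": 5, "C": 3, "E": 2, "G": 3}
s = 73.45
z = {2: 3.00, 3: 3.15, 4: 3.23, 5: 3.30, 6: 3.34, 7: 3.37}


def adjusted_difference(a, b):
    """|m_a - m_b| multiplied by sqrt(2 r_a r_b / (r_a + r_b))."""
    r_a, r_b = reps[a], reps[b]
    return math.sqrt(2.0 * r_a * r_b / (r_a + r_b)) * abs(means[a] - means[b])


def is_homogeneous(subset):
    """Basic rule: largest adjusted difference fails to exceed R'_p."""
    largest = max(adjusted_difference(a, b) for a, b in combinations(subset, 2))
    return largest <= s * z[len(subset)]


def maximal_homogeneous_subsets(labels):
    """Scan subsets from largest to smallest, keeping only maximal ones."""
    found = []
    for size in range(len(labels), 1, -1):
        for subset in combinations(labels, size):
            members = set(subset)
            if any(members <= larger for larger in found):
                continue  # already inside a homogeneous subset
            if is_homogeneous(subset):
                found.append(members)
    return found


ranked = sorted(means, key=means.get)
groups = maximal_homogeneous_subsets(ranked)
for group in groups:
    print("(" + "".join(sorted(group, key=means.get)) + ")")
```

For these seven means the search returns the three groups of section (e), (DFA), (FBC) and (BCEG). The number of subsets doubles with each added mean, so a direct search of this kind is only practical for small sets; the sequences and skipping steps of the text serve the larger cases.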


4. NUMERICAL EXAMPLE II: TEST OF TREATMENT TOTALS IN A 
SIMPLE LATTICE DESIGN (Including the Use of Skipping Short Cuts). 


Table 3 illustrates the application of a similar 5 per cent level test 
to the adjusted totals in a 5 × 5 simple lattice design. The data are
those given by Cochran and Cox [2, section 10.29] for a design with 
two repetitions. Table 4 gives additional details of the computation 
of the adjusted differences used in Table 3. 

Section (a), Table 3, shows the value of $s$ obtained from the error
mean square $s^2$ (denoted $E_e$ in [2]) for the experiment and its degrees
of freedom $n_2$. Section (b) shows the adjustment factors for differences
between treatment totals (totals being more convenient than means to
use in a case like this).

Cochran and Cox give $2E_e[1 + (n - 1)\mu]/r$ (their $n$ being the
number (2) of repetitions involved) for the estimated variance of a
difference between two means for treatments in the same block. Thus,
using $k_{++}\sigma^2$ to denote the variance of a difference between totals for
treatments in the same block we have $k_{++} = 2r[1 + (n - 1)\mu]$. Then
using $a_{++}$ for the corresponding adjustment factor, we have $a_{++} =
\sqrt{2/k_{++}} = (r[1 + (n - 1)\mu])^{-1/2}$. Similarly, if $a_{+-}$ is used to denote
the adjustment factor for differences between totals of treatments not
in the same block, we have $a_{+-} = \sqrt{2/k_{+-}} = (r[1 + n\mu])^{-1/2}$. In
this example, $r = 4$, $n = 2$, $\mu = 0.1270$ and the adjustment factors
work out to be as shown in section (b).
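The two lattice adjustment factors follow directly from the expressions above. A minimal sketch, with an illustrative function name and the values $r = 4$, $n = 2$, $\mu = 0.1270$ quoted in the text:

```python
def lattice_adjustment_factors(r, n, mu):
    """Return the same-block and different-block adjustment factors.

    Same block:       (r[1 + (n - 1) mu]) ** -1/2
    Different blocks: (r[1 + n mu]) ** -1/2
    """
    same_block = (r * (1.0 + (n - 1) * mu)) ** -0.5
    different_block = (r * (1.0 + n * mu)) ** -0.5
    return same_block, different_block


a_same, a_diff = lattice_adjustment_factors(r=4, n=2, mu=0.1270)
print(f"same block: {a_same:.4f}, different blocks: {a_diff:.4f}")
```

With $\mu$ rounded to four decimals the computed values agree with the tabulated .471 and .447 to within about half a unit in the third decimal place.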

Section (c) shows the ranked treatment totals and critical values 
required for the test. The arrangement is different from the correspond- 
ing sections of the previous example solely because of the largeness of 
the number (25) of treatments involved. The new arrangement is 
convenient for applying a skipping method which short cuts many of 
the steps at the beginning of each sequence. In all other respects the 
procedure is unchanged. In the first column the treatments 1, 2, ..., 25
as they have been denoted in [2] are redenoted A, B, ..., Y for
convenience in the recording of treatment subsets. The number $(i.j)$
appearing after each treatment letter denotes the blocks in which the
treatment falls. Thus (3.1) after treatment K shows that it belongs
to block 3 and to block 1 in the first and second types of replicates, 
respectively. These numbers are useful in indicating which treatments 
do and which do not appear together in the same block and thus which 
adjustment factor applies to each difference. 

The second column of section (c) shows the adjusted treatment 
totals from [2] followed by doubly adjusted treatment totals in paren- 
theses which, for brevity, we will call treatment totals and adjusted 


TABLE 3

5 PER CENT LEVEL MULTIPLE RANGE TEST OF ADJUSTED TREATMENT TOTALS FROM
A 5 × 5 SIMPLE LATTICE DESIGN

a) From Analysis of Variance

$n_2 = 56$, $s^2 = 13.60$, $s = 3.69$

b) Adjustment Factors for Differences between Treatment Totals

Two treatments in same block:      $a_{++} = .471$
Two treatments not in same block:  $a_{+-} = .447$

c) Ranked Treatment Totals and Critical Values

Treatment      Total          p     $z_p$    $R'_p$
11 K (3.1)     88.4 (39.5)    25    3.47     12.80
 2 B (1.2)     77.3 (34.6)    24    3.47     12.80
15 O (3.5)     74.7 (33.4)    23    3.47     12.80
14 N (3.4)     71.6 (32.0)    22    3.47     12.80
24 X (5.4)     70.6 (31.6)    21    3.47     12.80
22 V (5.2)     68.1 (30.4)    20    3.47     12.80
 1 A (1.1)     66.6 (29.8)    19    3.46     12.77
21 U (5.1)     61.4 (27.4)    18    3.45     12.73
 4 D (1.4)     58.8 (26.3)    17    3.44     12.69
16 P (4.1)     58.3 (26.1)    16    3.43     12.66
23 W (5.3)     55.7 (24.9)    15
25 Y (5.5)     52.7 (23.6)    14
13 M (3.3)     52.7 (23.6)    13
18 R (4.3)     52.6 (23.5)    12
20 T (4.5)     51.6 (23.1)    11
12 L (3.2)     51.1 (22.8)    10
 5 E (1.5)     50.9 (22.8)     9
 7 G (2.2)     47.6 (21.3)     8    3.28     12.10
 6 F (2.1)     46.9 (21.0)     7    3.25     11.99
10 J (2.5)     46.2 (20.7)     6
17 Q (4.2)     46.0 (20.6)     5
 8 H (2.3)     45.2 (20.2)     4
 3 C (1.3)     44.9 (20.1)     3
 9 I (2.4)     38.1 (17.0)     2
19 S (4.4)     21.5 (9.6)

d) Test Sequences

Seq.   Steps                                                          Result
1.     $39.5 - R'_{25} = 26.70$, $39.5 - R'_8 = 27.4$.
       $a_{++}(K - U) > R'_8$, $a_{++}(K - A) \not> R'_7$.                (AVXNOBK)
2.     $34.6 - R'_{24} = 21.80$, $34.6 - R'_{16} = 21.94$.
       $a_{++}(B - E) \not> R'_{16}$.                                    (ELTRMYWPDUAVXNOB)
3.     $33.4 - R'_{23} = 20.60$, $33.4 - R'_{19} = 20.63$,
       $33.4 - R'_{18} = 20.67$.
       $a_{++}(O - J) > R'_{18}$. FGEL ... O: -                          (FGELTRMYWPDUAVXNO)
4.     $32.0 - R'_{22} = 19.20$, $32.0 - R'_{20} = 19.20$.
       CHQJ ... N: -                                                  (CHQJFGELTRMYWPDUAVXN)
5.     $31.6 - R'_{21} = 18.80$.                                        -
6.     $30.4 - R'_{20} = 17.60$.                                        -
7.     $29.8 - R'_{19} = 17.03$.                                        -
8.     $27.4 - R'_{18} = 14.67$. -                                      (ICHQJFGELTRMYWPDU)
last   $9.6 + R'_{17} = 22.29$, $9.6 + R'_8 = 21.70$.
       SICHQJFG: $a_{++}(Q - S) \not> R'_8$.                             (SICHQJFG)

e) Final Results

(SICHQJFG)(ELTRMYWPDUAVXNOB)
(ICHQJFGELTRMYWPDU)(AVXNOBK)
(CHQJFGELTRMYWPDUAVXN)
(FGELTRMYWPDUAVXNO)

Any two treatments not appearing together within the same parentheses are
significantly different. Any two treatments appearing together within the same
parentheses are not significantly different.


treatment totals, respectively. Each adjusted treatment total in 
parentheses is obtained by multiplying the corresponding treatment 
total by the smallest adjustment factor. In this example there are only 
two adjustment factors, the smaller being .447, hence the adjusted 
total for K, for instance, is 88.4(.447) = 39.5 as shown. The column 
of adjusted totals is a new feature needed in the skipping short cut 
steps. 

The last two columns of section (c) show the Studentized ranges $z_p$
and the critical values $R'_p = 3.69z_p$. The middle column for $p$ helps
in identifying the $z_p$ and $R'_p$ values. The $z_p$ values in this example are
obtained from [3, Table II] for a 5 per cent level test the same as in the
previous example except that for larger values of $p$ some simple linear
interpolation is needed. When a large number of treatments is involved
as in this example, not all of the critical values $R'_p$ will be required.
Each one should thus be obtained only as needed in the sequence steps.
In Table 3, for instance, only 12 of the possible 24 $R'_p$ values are ultimately
found to be needed.

Section (d) shows the main part of the test arranged in steps within
sequences as in the previous example. The first two steps in sequence 1
are skipping steps and short cut the individual testing of 17 differences.
In the first step, the largest critical value, $R'_{25} = 12.80$, is subtracted
from the largest adjusted total 39.5 (for K) giving 26.70. From this
it is concluded that all treatments with adjusted totals below 26.70,
namely D, P, ..., S, are significantly lower than K. They can thus
be dropped from the set leaving the subset UA ... K for testing in the
second step.

The truth of this conclusion is readily seen as follows: Consider
any one of the differences concluded significant, say $K - M$ for example.
We have $M(.447) < K(.447) - R'_{25}$, implying $(K - M)(.447) > R'_{25}$,
thus $(K - M)(.471) > R'_{13}$, since $R'_{25} > R'_{13}$, and hence $(K - M)' > R'_{13}$.
Similarly each of the adjusted differences concerned exceeds its
corresponding critical value and all are thus significant.

In the second step, testing the 8-treatment subset UA ... K in a
similar way, the largest critical value $R'_8$ involved is subtracted from
the top adjusted total giving 27.4. If there were any further adjusted
totals below 27.4 the treatments concerned could be dropped off and
another similar step would be applied. In these data no further
treatments can be dropped and the skipping method terminates at this
second step. The remainder of the sequence is finished by steps of the
type already described in the first example, and for which the
arithmetical details are given in Table 4. Thus in step 3, $a_{++}(K - U) > R'_8$
and in step 4, $a_{++}(K - A) \not> R'_7$. This terminates the sequence
since $K - A$ has the larger adjustment factor $a_{++}$, and no other difference
within the subset AV ... K can exceed $R'_7$. This result may be usefully
recorded as before by putting the subset in parentheses as shown in the
result column of section (d) at the end of sequence 1.
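The skipping steps lend themselves to a short routine. The sketch below is one possible rendering, not the author's computing scheme: the names `ADJUSTED`, `CRITICAL` and `skipping_steps` are illustrative, the adjusted totals and critical values are the rounded figures of Table 3, section (c), and `Decimal` is assumed so that comparisons such as 27.4 against 39.5 - 12.10 are exact.

```python
from decimal import Decimal

# (treatment, adjusted total) in descending order, Table 3, section (c).
ADJUSTED = [
    ("K", "39.5"), ("B", "34.6"), ("O", "33.4"), ("N", "32.0"), ("X", "31.6"),
    ("V", "30.4"), ("A", "29.8"), ("U", "27.4"), ("D", "26.3"), ("P", "26.1"),
    ("W", "24.9"), ("Y", "23.6"), ("M", "23.6"), ("R", "23.5"), ("T", "23.1"),
    ("L", "22.8"), ("E", "22.8"), ("G", "21.3"), ("F", "21.0"), ("J", "20.7"),
    ("Q", "20.6"), ("H", "20.2"), ("C", "20.1"), ("I", "17.0"), ("S", "9.6"),
]

# Only the critical values R'_p that Table 3 actually lists.
CRITICAL = {
    25: "12.80", 24: "12.80", 23: "12.80", 22: "12.80", 21: "12.80",
    20: "12.80", 19: "12.77", 18: "12.73", 17: "12.69", 16: "12.66",
    8: "12.10", 7: "11.99",
}


def skipping_steps(top):
    """Drop every treatment whose adjusted total lies below top - R'_p.

    Returns the surviving treatments (top first) and the thresholds used.
    A KeyError signals that an R'_p not listed in Table 3 would be needed.
    """
    start = [label for label, _ in ADJUSTED].index(top)
    subset = [(label, Decimal(total)) for label, total in ADJUSTED[start:]]
    thresholds = []
    while True:
        threshold = subset[0][1] - Decimal(CRITICAL[len(subset)])
        thresholds.append(threshold)
        kept = [(label, total) for label, total in subset if total >= threshold]
        if len(kept) == len(subset):
            return [label for label, _ in subset], thresholds
        subset = kept


labels, thresholds = skipping_steps("K")
print("".join(labels), [str(t) for t in thresholds])
```

Because the adjusted totals are formed with the smaller factor .447, a treatment dropped in a skipping step is certainly significant, while a surviving one still needs the ordinary steps with its own adjustment factor.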


TABLE 4

ARITHMETICAL DETAILS FOR CALCULATING ADJUSTED DIFFERENCES IN TABLE 3,
SECTION (d)

$(K - U)' = a_{++}(K - U) = .471(88.4 - 61.4) = .471(27.0) = 12.72$.

Similarly

(K - A)' = .471(21.8) = 10.27,    (B - E)' = .471(26.4) = 12.43,
(O - J)' = .471(28.5) = 13.42,    (Q - S)' = .471(24.5) = 11.54.


Sequence 2 is very similar to sequence 1. The skipping procedure
starts by subtracting $R'_{24}$ from the second highest adjusted total 34.6 for
B and terminates again in the second step. The remainder of the sequence
terminates at the third step with EL ... B being found homogeneous.

In sequence 3 an additional treatment is dropped in the second step
and the skipping procedure extends to the third step. Continuing the
remainder of the sequence, $a_{++}(O - J)$ is found to exceed $R'_{18}$ in the
fourth step; $a_{+-}(O - F)$ is known not to exceed $R'_{17}$ from the preceding
skipping steps, so FGEL ... O is recorded at the fifth for further internal
testing. The largest $(O - ?)$ difference with an $a_{++}$ adjustment factor
is $O - E$. However, $a_{++}(O - E)$ cannot exceed $R'_{17}$ since $a_{++}(B - E)
\not> R'_{16}$ in sequence 2; hence FG ... O is homogeneous and the sequence
terminates. Sequence 4 has only two skipping steps and terminates in
a similar way. Sequences 5, 6 and 7 each terminate rapidly at the
first step when the subsets concerned are found to fall entirely within
(CH ... N) of sequence 4. In sequence 8, the difference between the
adjusted total for I, that is, 17.0, and the critical level 14.67 is so great
as to leave no doubt of the final result (IC ... U) at the end of the first
step.

As soon as all treatments but the lowest have been included in
homogeneous subsets, as is the case in the example at the end of sequence
8, the test can be completed in one reverse type of sequence working
from the bottom total. The reasons for this will be evident from the
steps of the last sequence in section (d). The largest possible
homogeneous subset in which the bottom treatment S could be included at
this stage is SI ... D and contains 17 treatments. The first step is to
obtain $S + R'_{17} = 9.6 + 12.69 = 22.29$. From this it follows that all
treatments with adjusted totals above 22.29, that is E, L, T, ..., are
significantly larger than S. This leaves the 8-treatment subset SI ... G
for testing in the next step. In the second step $S + R'_8 = 21.70$, no
additional adjusted totals exceed this and the skipping procedure
terminates. Since the range $G - S$ of the subset has the adjustment
factor $a_{+-}$ we already know from step 2 that $a_{+-}(G - S) \not> R'_8$,
hence SICHQJFG is recorded for internal testing. The largest $(? - S)$
difference with the $a_{++}$ adjustment factor is $(Q - S)$ and this is therefore
tested in step 3. Since $a_{++}(Q - S) \not> R'_8$ the subset is homogeneous,
and is recorded (SICHQJFG). This terminates the test.

Section (e) of Table 3 shows the complete summary of the test 
results. In this example the ordering of treatments does not vary from 
one homogeneous subset to another. In such a case the method of 
representing the results by underscoring a single set of treatment letters 
as is done in [3] may be used if preferred.


5. NOTES ON THE PROPERTIES OF THE PROPOSED TEST


Two-mean significance levels: A two-mean significance level in a test
of $n$ means may be defined [3] as the maximum probability of finding a
significant difference between any two means $m_i$ and $m_j$ given that
$\mu_i = \mu_j$, where $\mu_i = E(m_i)$ and $\mu_j = E(m_j)$. This may be written as
$\max P[D_{ij} \mid \mu_i = \mu_j]$, where $D_{ij}$ denotes the decision that $m_i$ and
$m_j$ are significantly different.

In any $\alpha$-level test of the proposed type we have

$$\max P[D_{ij} \mid \mu_i = \mu_j] = P[\,a_{ij}|m_i - m_j| > s z_2 \mid \mu_i = \mu_j\,].$$

Since $z_2 = \sqrt{2}\,t_{n_2,\alpha}$,
where $t_{n_2,\alpha}$ is the $\alpha$-level (two-sided) significant value of $t_{n_2}$, a $t$ statistic
with $n_2$ degrees of freedom, and since the variance of $m_i - m_j$ is $2\sigma^2/a_{ij}^2$,
this readily reduces to

$$P[\,|t_{n_2}| > t_{n_2,\alpha}\,] = \alpha.$$

Hence the two-mean significance levels in an $\alpha$-level test of the
proposed type are exactly $\alpha$ as desired.

Higher order significance levels and power: In considering these further
aspects of the proposed test it is helpful to study the decision regions
for a 5 per cent level test [3] of three unequally replicated means $m_1$, $m_2$
and $m_3$ with $r_1 = 2$, $r_2 = 3$ and $r_3 = 4$ replications and in which $n_2 = \infty$
and $s^2 = \sigma^2 = 1$. If these regions are plotted in a plane with coordinates

$$x_1 = (m_1 - m_2)\sqrt{2r_1r_2/(r_1 + r_2)}$$

and

$$x_2 = [(r_1m_1 + r_2m_2) - (r_1 + r_2)m_3]\sqrt{2r_3/[(r_1 + r_2)(r_1 + r_2 + r_3)]}$$

as is done in Figure 1, they are directly comparable to those of the
corresponding 5 per cent level test of three equally replicated means
($r_1 = r_2 = r_3$) shown in Figure 3 of [3]. The distribution function for
the points $(x_1, x_2)$ is the same in both cases, namely a bivariate normal
with variances 2 and 2 and with covariance zero.

Because the strip regions (1, 2), (1, 3) and (2, 3) have the same 
minimum widths in each case ($2z_{2,\infty} = 2 \times 2.77$) it follows, as has
already been proved, that the two-mean significance levels are 5 per 
cent for the test in Figure 1 as well as for the test with equal replications. 

The only difference between the regions in the two cases is that in
Figure 1 the angles between the strip regions (1, 2), (1, 3), (2, 3) and
(1, 2) are 50°46', 67°48' and 61°26' instead of all being 60° as in the
other figure. (The cosine of the angle between any two strips $(h, i)$
and $(h, j)$ is given by

$$\sqrt{r_ir_j/[(r_h + r_i)(r_h + r_j)]}.)$$


FIGURE 1

5% level test, $r_1 = 2$, $r_2 = 3$, $r_3 = 4$, $n_2 = \infty$, $\sigma^2 = 1$.


The sides of the hexagonal regions (1, 2, 3) are parallel with the 
corresponding strips. Since these hexagons have the same inscribed 
circle of radius $z_{3,\infty} = 2.92$ and differ only in having a little asymmetry
in Figure 1, the three-mean protection level (the probability $P[(x_1, x_2) \in
(1, 2, 3) \mid E(x_1) = E(x_2) = 0]$) for the Figure 1 test is a close approximation
to the desired level .9025 obtained in the other test. Furthermore, it 
seems safe to assume that any deviation due to the asymmetry would 
be positive. 

In terms of three-mean significance levels, the level of the Figure 1 
test may thus be said to be close to .0975 (= 1 - .9025) as desired, and
any deviation from this appears to be on the negative or conservative
side.

The close similarity of the regions of Figure 1 with those of Figure 3 
[3] also indicates that the power functions of the Figure 1 test closely 
approximate the desirable ones of the other procedure. 

If the means are correlated as well as being heteroscedastic the 
geometrical picture is virtually unchanged. If we let $[c_{ij}]_{3 \times 3}$ represent
the dispersion matrix in the case of three means, the cosine of the angle
between any two strips $(h, i)$ and $(h, j)$ in a set of regions otherwise
similar to those of Figure 1 would be

$$\frac{c_{hh} - c_{hi} - c_{hj} + c_{ij}}{\sqrt{(c_{hh} - 2c_{hi} + c_{ii})(c_{hh} - 2c_{hj} + c_{jj})}}.$$

This may be expressed in terms of the $k_{ij}$ factors ($k_{ij}\sigma^2 = c_{ii} - 2c_{ij} + c_{jj}$)
defined in section 2, as

$$\frac{k_{hi} + k_{hj} - k_{ij}}{2\sqrt{k_{hi}k_{hj}}}$$


and the degree of asymmetry involved depends on these. 
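The angles quoted for Figure 1 can be recovered from either form of the cosine. The following sketch uses hypothetical function names and assumes uncorrelated means, for which $k_{ij} = 1/r_i + 1/r_j$; it evaluates both the replication-number form and the $k_{ij}$ form for $r_1 = 2$, $r_2 = 3$, $r_3 = 4$.

```python
import math

reps = {1: 2, 2: 3, 3: 4}  # replication numbers used for Figure 1


def k_factor(i, j):
    """k_ij for uncorrelated means: variance of m_i - m_j in units of sigma^2."""
    return 1.0 / reps[i] + 1.0 / reps[j]


def cosine_from_reps(h, i, j):
    """Cosine of the angle between strips (h, i) and (h, j), replication form."""
    r_h, r_i, r_j = reps[h], reps[i], reps[j]
    return math.sqrt(r_i * r_j / ((r_h + r_i) * (r_h + r_j)))


def cosine_from_k(h, i, j):
    """The same cosine written in terms of the k_ij factors."""
    k_hi, k_hj, k_ij = k_factor(h, i), k_factor(h, j), k_factor(i, j)
    return (k_hi + k_hj - k_ij) / (2.0 * math.sqrt(k_hi * k_hj))


def angle_degrees(h, i, j):
    return math.degrees(math.acos(cosine_from_reps(h, i, j)))


for h, i, j in [(1, 2, 3), (3, 1, 2), (2, 3, 1)]:
    degrees = angle_degrees(h, i, j)
    whole = int(degrees)
    minutes = round((degrees - whole) * 60)
    print(f"strips ({h},{i}) and ({h},{j}): {whole} deg {minutes} min")
```

With equal replication numbers both forms give a cosine of 1/2, that is, the 60° angles of the equally replicated case.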

Similar considerations lead to the conclusion that the higher order 
levels of the proposed complete test and its power functions are reason- 
ably close to the desired levels and functions existing in corresponding 
tests of uncorrelated and homoscedastic means and that the deviations 
involved appear to be on the conservative side.


APPROPRIATE SCORES FOR REACTION CATEGORIES
DEPENDENT ON TWO VARIABLES* 


JOHANNES IPSEN, JR. 


Associate Professor of Public Health, Harvard School of Public Health, Superintendent, 
Massachusetts Institute of Laboratories, U.S.A. 


1. In an earlier paper (Ipsen [2]) the appropriate reaction scores for 
bio-assays were given where the reactions to various doses of a biologic 
agent were observed in biological, mutually exclusive categories. Death 
times and survivor symptoms were examples of such categories. Re- 
actions to anti-emetic drugs can be similarly treated (Ciminera et al. 
[1]). 

For such biometric purposes the appropriate scores are defined as 
the scores whose linear regression on the independent variable (dose) 
is a maximum part of the total variation of reaction scores. They are 
computed as the mean dose for each category or any linear transforma- 
tion thereof. 

Thus, if $n_i$ subjects that fall in the $i$-th category of reaction are
exposed to various doses, the sum of which is $S(x_i)$, the appropriate
score ($c_i$) for that category would be

$$c_i = b\,\frac{S(x_i)}{n_i} + a \qquad (1.1)$$

where $b$ is different from zero and $a$ can be any rational number.

The choice of $a$ and $b$ does not affect the information.


2. If the subjects are exposed to two agents with doses x and z 
respectively, the problem of appropriate scores will consist in assigning 
a score system that will maximize the multiple regression contribu- 
tion to the total variance. 

The following notation will be used: 


$N$ = Total number of subjects
$n_i$ = Number of subjects in the $i$-th category
$S(x_i)$, $S(z_i)$ = Sum of respective doses in the $i$-th category
$S(x)$, $S(z)$ = The total sum of respective doses
$[x^2] = S(x^2) - (S(x))^2/N$
$[z^2] = S(z^2) - (S(z))^2/N$
$[xz] = S(xz) - S(x)S(z)/N$
$[x_i^2] = \sum_i S^2(x_i)/n_i - (S(x))^2/N$
$[z_i^2] = \sum_i S^2(z_i)/n_i - (S(z))^2/N$
$[x_iz_i] = \sum_i S(x_i)S(z_i)/n_i - S(x)S(z)/N$


*This work was conducted under the sponsorship of the Commission on Immunization Armed
Forces Epidemiological Board, and was supported in part under contract with the Office of the Surgeon
General, Department of the Army.


The appropriate score system ($y_i$) for the multiple regression is linearly
related to each of the appropriate systems for the single regressions:

$$y_i = (S(x_i) + \beta S(z_i))/n_i \qquad (2.1)$$

or linear transformations thereof.
Consequently, we have

$$S(y - \bar{y})(x - \bar{x}) = [xy] = [x_i^2] + \beta[x_iz_i] \qquad (2.2)$$
$$S(y - \bar{y})(z - \bar{z}) = [zy] = [x_iz_i] + \beta[z_i^2] \qquad (2.3)$$
$$S(y - \bar{y})^2 = [y^2] = [x_i^2] + 2\beta[x_iz_i] + \beta^2[z_i^2] \qquad (2.4)$$

The multiple regression sum of squares is

$$R = \frac{[xy]^2[z^2] + [zy]^2[x^2] - 2[xy][zy][xz]}{[x^2][z^2] - [xz]^2} \qquad (2.5)$$

$\theta = R/[y^2]$ is then the fraction which should be maximized
by a suitable choice of the coefficient $\beta$. (2.6)

Inserting (2.2), (2.3) and (2.4) in the expression for $\theta$, differentiating $\theta$
with respect to $\beta$ and equating the result to zero, we have

$$\beta^2([z_i^2][xz] - [x_iz_i][z^2]) + \beta([z_i^2][x^2] - [x_i^2][z^2]) + [x_iz_i][x^2] - [x_i^2][xz] = 0 \qquad (2.7)$$

The two solutions for $\beta$ inserted in (2.1) will give two score systems that
represent a maximum and minimum regression variance contribution,
respectively. It is easy to ascertain which of the two values of $\beta$ will
yield maximum information.
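The choice between the two solutions can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the paper's computation: the doses and categories are invented for the purpose (they are not the data of the example that follows), and the names `bracket_sums`, `fraction` and `stationary_betas` are hypothetical. It forms the bracketed sums of the notation list, solves the quadratic for the coefficient, and evaluates the ratio of (2.6) at each root.

```python
import math
from collections import defaultdict


def bracket_sums(x, z, category):
    """Corrected sums of squares and products, overall and between categories."""
    n_total = len(x)
    sx, sz = sum(x), sum(z)
    cat_x, cat_z, cat_n = defaultdict(float), defaultdict(float), defaultdict(int)
    for a, b, c in zip(x, z, category):
        cat_x[c] += a
        cat_z[c] += b
        cat_n[c] += 1
    return {
        "xx": sum(v * v for v in x) - sx * sx / n_total,
        "zz": sum(v * v for v in z) - sz * sz / n_total,
        "xz": sum(a * b for a, b in zip(x, z)) - sx * sz / n_total,
        "xi2": sum(cat_x[c] ** 2 / cat_n[c] for c in cat_n) - sx * sx / n_total,
        "zi2": sum(cat_z[c] ** 2 / cat_n[c] for c in cat_n) - sz * sz / n_total,
        "xizi": sum(cat_x[c] * cat_z[c] / cat_n[c] for c in cat_n) - sx * sz / n_total,
    }


def fraction(beta, s):
    """Regression sum of squares (2.5) divided by [y^2], as in (2.6)."""
    xy = s["xi2"] + beta * s["xizi"]
    zy = s["xizi"] + beta * s["zi2"]
    yy = s["xi2"] + 2 * beta * s["xizi"] + beta ** 2 * s["zi2"]
    regression = (xy ** 2 * s["zz"] + zy ** 2 * s["xx"] - 2 * xy * zy * s["xz"]) / (
        s["xx"] * s["zz"] - s["xz"] ** 2
    )
    return regression / yy


def stationary_betas(s):
    """Both roots of the quadratic obtained by setting the derivative to zero."""
    a = s["zi2"] * s["xz"] - s["xizi"] * s["zz"]
    b = s["zi2"] * s["xx"] - s["xi2"] * s["zz"]
    c = s["xizi"] * s["xx"] - s["xi2"] * s["xz"]
    root = math.sqrt(b * b - 4 * a * c)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


# Invented doses for 12 subjects in 4 reaction categories (illustration only).
x = [0.0, 0.5, 1.0, 0.8, 1.2, 1.5, 1.4, 2.0, 2.2, 1.9, 2.5, 3.0]
z = [1.0, 1.2, 1.0, 2.5, 2.0, 2.4, 1.2, 1.8, 1.5, 2.9, 2.4, 3.1]
category = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]

sums = bracket_sums(x, z, category)
betas = stationary_betas(sums)
best_beta = max(betas, key=lambda beta: fraction(beta, sums))
print(f"roots: {betas[0]:.3f}, {betas[1]:.3f}; maximizing root: {best_beta:.3f}")
```

Evaluating the ratio at both roots, as done here, is one simple way of deciding which solution corresponds to the maximum.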
3. Example. Untoward reaction to combined diphtheria and tetanus
immunization. The modern trend in the practice of immunization is to 
combine several antigens in one preparation. The advantage of fewer 
injections is, however, sometimes offset by the higher frequency of 
untoward reactions such as local swelling, fever, and other discomforts. 
This is particularly true in adults where it is often found that the degree 
of untoward reaction is related to the size of antibody response. Diph- 
theria and tetanus toxoids have been combined for a long time, and it 
has usually been assumed that untoward reactions to such combinations 
are solely due to previous exposure to diphtheria toxin or other less 
defined antigens in the prophylactic. The more the individual has been 
exposed to diphtheria antigen either through natural infections or 
through immunization, the higher is the ensuing antibody production 
as well as the probability of discomfort. Tetanus immunity is not 
acquired by natural infection and it has long been observed that a 
single injection of tetanus toxoid rarely induces reaction. However, 
in the last two decades more and more people have received several 
injections of tetanus toxoid, and it is to be expected that greater sen- 
sitivity to this antigen will occur. 

Forty-two young adults were each given one injection of a combination of
diphtheria and tetanus toxoid that contained 1 Lf unit of diphtheria 
and 5 Lf units of tetanus toxoid (Ipsen [3]). Reactions were recorded 
in the days following the injections and a blood sample was drawn four 
weeks after the injection for measurement of antibody concentration 
in the serum. It was estimated that about one-half of these individuals 
had previous exposure to either diphtheria or tetanus antigen or both. 
Table 1 presents the serum antibody titers for the two agents, arranged 
in pairs, and in four categories of observed reaction. The antibody 
titers are given in logarithms of antitoxin units per 100 ml. The titrations
for tetanus were not carried below 0.1 unit per ml. (or 10 units
per 100 ml.). Specimens with less than this amount are recorded as 1.0.
For diphtheria the lowest observation was 0.01 units per ml. (or 1 
unit per 100 ml.) Specimens with this amount or less are recorded as 0.0. 

The biometric problem consists in assigning a score system to the 
reaction that will give maximum information on the relation of reaction 
to both antibody titers simultaneously. The biological assumption 
is that such reaction is positively correlated with the amount of antibody 
response in both cases. The bottom of Table 1 presents the statistics 
used for the computation of the coefficient β that determines the
relative importance of the two antigens. Inserting statistics in equation 
(2.7), we obtain the following expression: 


$$65.633\,\beta^2 + 263.309\,\beta - 6.576 = 0 \qquad (3.1)$$
of which the two solutions are β = +0.025 and β = −4.037.


TABLE 1

Diphtheria (z) and tetanus (x) serum antitoxin in 42 adults, 4 weeks after one injection
of combined diphtheria and tetanus toxoids, by untoward reaction category. (Titers
are in log antitoxin units per 100 ml. serum.)
(? marks an entry illegible in the source.)

      (A)             (B)               (C)               (D)
      None         Local redness    Malaise, fever      Local and
                   or soreness      etc. “systemic”     systemic
  Diph.   Tet.     Diph.   Tet.     Diph.   Tet.      Diph.   Tet.
   0.0    1.0       0.0    1.0       0.8     ?          ?     2.8
   0.0     ?        0.5     ?        0.0    2.8         ?     3.4
   0.0    1.0       2.0    1.0        ?     2.8        2.5    3.4
   0.0    1.0       0.0     ?        0.0     ?
   0.2     ?        0.8     ?         ?     3.4
   0.2    1.0        ?     1.9       0.8    3.4
   0.8    1.0        ?     1.9       1.6    3.4
   0.8    1.0       0.0    2.2
   0.8     ?        0.0    2.2
    ?     1.0       0.0    2.5
   1.9    1.0       2.8    2.5
   2.0     ?        1.3    2.8
   0.0     ?         ?     3.4
    ?     1.6
   0.0    2.2
   0.0    2.8
    ?     2.8
   1.9    2.8
    ?     3.4

              A       B       C       D     Totals
n_i          19      13       7       3       42
S(x_i)      28.9    25.0    19.9     9.6     83.4
S(z_i)      12.8    12.2     4.8     4.9     34.7

S(x²) = 203.28         S(z²) = 55.27
[x²] = 37.6714         [x_i²] = 13.7196
[z²] = 26.6012         [z_i²] = 2.6983
[xz] = 8.7257          [x_i z_i] = 3.3524
The two score systems are now computed by inserting β into equation
(2.1), the values for which are arranged in Table 2. However, it is
more convenient to convert this system to linear transformations 


$$y'_i = (y_i - y_A)/(y_D - y_A) \qquad (3.2)$$


which have score 0 for the lowest category (A) and score 1 for the highest 
category (D) as shown in Table 2. The choice between these score 
systems depends on which system will give the highest information 
measured by θ. Using the equations (2.2) to (2.6) we obtain


for β = 0.025; θ = 0.3643
for β = −4.037; θ = 0.0756


Obviously, then, the positive value of β is the desired coefficient since
it gives the highest information. Also the score system is more “logical”
since it follows the biological ranks of the categories.


TABLE 2
SCORE SYSTEMS WITH MAXIMUM AND MINIMUM INFORMATION

               β = 0.025           β = −4.037
Category      y_i     y'_i        y_i      y'_i
A            1.538    0.00      −1.199     0.00
B            1.947    0.24      −1.866     0.30
C            2.860    0.78      +0.075    −0.58
D            3.241    1.00      −3.394     1.00
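
The computation behind equations (2.7), (3.1) and (3.2) can be retraced numerically. The sketch below is an illustration under stated assumptions, not the author's program: it takes the category counts and totals and the three total sums of squares and products from Table 1, rebuilds the between-category quantities from the totals, solves the quadratic for β, and keeps the root with the larger θ. All function and variable names are invented for this example.

```python
import math

# Category counts and totals from Table 1 (x = tetanus, z = diphtheria).
n = [19, 13, 7, 3]
S_x = [28.9, 25.0, 19.9, 9.6]
S_z = [12.8, 12.2, 4.8, 4.9]
# Total corrected sums of squares and products [x^2], [z^2], [xz] from Table 1.
xx, zz, xz = 37.6714, 26.6012, 8.7257

N = sum(n)


def between(u, v):
    """Between-category corrected sum of products, e.g. [x_i z_i]."""
    return sum(a * b / k for a, b, k in zip(u, v, n)) - sum(u) * sum(v) / N


a = between(S_x, S_x)  # [x_i^2]
b = between(S_x, S_z)  # [x_i z_i]
c = between(S_z, S_z)  # [z_i^2]


def theta(beta):
    """Information R / [y^2] for a given beta, following (2.2) to (2.6)."""
    xy = a + beta * b
    zy = b + beta * c
    yy = a + 2 * beta * b + beta ** 2 * c
    R = (xy ** 2 * zz + zy ** 2 * xx - 2 * xy * zy * xz) / (xx * zz - xz ** 2)
    return R / yy


# Coefficients of the quadratic (2.7) in beta.
q2 = c * xz - b * zz
q1 = c * xx - a * zz
q0 = b * xx - a * xz
disc = math.sqrt(q1 ** 2 - 4 * q2 * q0)
roots = sorted([(-q1 + disc) / (2 * q2), (-q1 - disc) / (2 * q2)])

# Keep the root giving the larger information, then form the scores (2.1)
# and rescale them so that category A scores 0 and category D scores 1 (3.2).
best = max(roots, key=theta)
scores = [(sx + best * sz) / k for sx, sz, k in zip(S_x, S_z, n)]
scaled = [(s - scores[0]) / (scores[-1] - scores[0]) for s in scores]

print("roots:", [round(r, 3) for r in roots])
print("theta at each root:", [round(theta(r), 4) for r in roots])
print("scaled scores:", [round(s, 2) for s in scaled])
```

Working from the category totals rather than from the individual titers keeps the example short; the within-category detail enters only through the three total quantities.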


4. Biological Conclusions: Using the appropriate score system, we 
can now express the expected degree of reaction on the basis of a known 
antibody response to diphtheria and tetanus respectively, by means 
of the multiple regression equation 


$$y' = -0.1545 + 0.2146x + 0.0050z \qquad (4.1)$$


It is obvious that the influence of tetanus antitoxin is much greater on 
the response than that of diphtheria antitoxin. As a matter of fact, the 
information obtained by tetanus antitoxin alone amounts to 0.3641 
while the maximum information obtained by combining the two independent
variates amounted to 0.3643. The difference of less than 1
per thousand is then the estimated additional information that would
come from including the diphtheria antibody response in the expectancy
equation.
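
The comparison of single and combined information can be checked from the category totals alone. The sketch below assumes that the single-regression score for a category is simply its mean titer, in which case the information reduces to the ratio of the between-category sum of squares to the total sum of squares. That reduction, and every name in the code, is an assumption made for illustration; the combined figure 0.3643 is taken from the text, not recomputed.

```python
# Category counts and totals from Table 1.
n = [19, 13, 7, 3]
S_x = [28.9, 25.0, 19.9, 9.6]   # tetanus totals per reaction category
S_z = [12.8, 12.2, 4.8, 4.9]    # diphtheria totals per reaction category
xx, zz = 37.6714, 26.6012       # total corrected sums of squares [x^2], [z^2]
N = sum(n)


def between_ss(totals):
    """Between-category corrected sum of squares from category totals."""
    return sum(t * t / k for t, k in zip(totals, n)) - sum(totals) ** 2 / N


# With category means as scores, [xy] = [y^2] = between_ss, so the
# information [xy]^2 / ([x^2][y^2]) reduces to between_ss / total.
info_tetanus = between_ss(S_x) / xx
info_diphtheria = between_ss(S_z) / zz

combined = 0.3643               # maximum information reported in the text
gain = combined - info_tetanus  # what diphtheria adds beyond tetanus alone

print(round(info_tetanus, 4), round(info_diphtheria, 4), round(gain, 4))
```

On this reading the diphtheria titer carries some information by itself, but almost none that the tetanus titer does not already supply.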
This analysis has prompted our Biological Laboratories to conduct
further investigations into the importance of the tetanus component of 
combined vaccines in respect to untoward reactions. 

The author wishes to acknowledge the assistance of Harry 1. Bowen, 
Ph.D., in arranging the immunization trial, and to Mrs. Hanna Syl- 
westrowicz, B.A., for performing the antibody titrations. 
POLYMORPHISM IN SOME AUSTRALIAN LOCUSTS
AND GRASSHOPPERS 


R. E. BLACKITH
Imperial College Field Station, Sunninghill, Berks., England 


INTRODUCTION 


The measurement of changes of form in closely related organisms 
has often been attempted by compounding pairs of characters into 
ratios, or by examining those transformations which secure the super- 
position of outlines of the organisms traced on some system of coordinates
(Le Gros Clark and Medawar [1945]). Attempts to make the
coordinate transformation method quantitative, and the ratio method 
more general, have not been conspicuously successful. The limitations 
of ratios and their generalisation are illuminated by comparing them 
with discriminant functions in which characters are added or subtracted. 
A change of scale, to a logarithmic coordinate network, will transform 
the ratio A/B to the discriminant log A — log B. Here there are two 
characters, with equal and opposite weights, but the discriminant 
function could evidently be constructed of some such expression as 
$X = \beta_1 \log A - \beta_2 \log B + \beta_3 \log C \cdots$ which is restored, on relaxing
the scale distortion, to $Y = (A^{\beta_1} \cdot C^{\beta_3} / B^{\beta_2})$, which also serves to generalise
the classical equation of allometric growth $Z = k \cdot A^{\alpha}$ (Teissier [1937]).
Any number of characters may be included in this way, the weights 
being calculated to give the greatest possible discrimination between 
any pair of groups of organisms under comparison. 
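
The correspondence between a weighted sum of logarithms and a generalised ratio is easy to confirm numerically. The following sketch uses invented measurements and weights purely for illustration; the function names are hypothetical, and the signs of the weights carry the subtractions.

```python
import math


def log_discriminant(values, weights):
    """Weighted sum of log-measurements: sum of w * log(v)."""
    return sum(w * math.log(v) for v, w in zip(values, weights))


def power_product(values, weights):
    """Generalised ratio: product of v ** w (negative w puts v in the denominator)."""
    product = 1.0
    for v, w in zip(values, weights):
        product *= v ** w
    return product


# Hypothetical measurements of three characters and illustrative weights.
A, B, C = 12.0, 4.0, 3.0
weights = [1.5, -0.5, 2.0]

X = log_discriminant([A, B, C], weights)
Y = power_product([A, B, C], weights)

# Undoing the logarithmic scale recovers the generalised ratio.
print(math.isclose(math.exp(X), Y))
# The simple ratio A/B is the special case with weights +1 and -1.
print(math.isclose(log_discriminant([A, B], [1, -1]), math.log(A / B)))
```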

The use of linear discriminant functions of the form $X = \beta_1 A + \beta_2 B + \beta_3 C + \cdots$
for the assessment of differences of form and size in
quantitative anthropology (Mahalanobis, Majumdar and Rao [1949])
and in the analysis of the changes of shape of silkworm cocoons (Fraisse
and Arnoux [1954]), to name only two examples with animals, has had
sufficient success to suggest that a quantitative appraisal of changes of 
size and shape has in this way become available almost unnoticed by 
those interested in the classical problems of growth and form. These 
discriminant functions are vectors which define the direction in which 
two groups of organisms differ, so that where more than one pair of 
groups is under comparison, the contrasts between them may differ
qualitatively as well as quantitatively. A change in the relative weights 
of the characters which make up the function alters the direction of the 
contrast between the groups. With each discriminant function there 
is associated a corresponding separation of the two groups. This 
disjunction is defined by the expression 


$$D^2 = \beta_1 d_1 + \beta_2 d_2 + \beta_3 d_3 + \cdots$$


where D is the generalised distance between the groups. This distance
separates the mean position of each group in a space of as many dimensions
as there are characters in the discriminant function. As for the
rest of the expression, the β-coefficients are the weights of the characters
in the discriminant function. These coefficients are found by serially
multiplying the differences ($d_i$) between the mean values of each character
in the two groups by the rows or columns of the inverse of the
dispersion matrix. The original dispersion matrix (A) consists of the
variances and covariances of the characters, arranged in matrix form:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots \\ a_{21} & a_{22} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

in which such entries as $a_{11}$ along the leading diagonal represent the
variances of the several characters, and the entries such as $a_{12}$ represent
the covariances, for example of the first and second characters. The
determinant of this dispersion matrix has been described by Kendall
[1946] as playing the same part in multivariate analysis as does the
variance in the ordinary, univariate, case. The inverse of this matrix,
$A^{-1}$, may be written out with entries $c_{ij}$ in place of the $a_{ij}$ of the
dispersion matrix A. These entries in $A^{-1}$ are expeditiously found by
the method described by Rao [1952] but there is a variety of computational
arrangements advocated by different authors, for what is essentially
the process of disentangling the correlations between the several
characters. This process of finding the β-coefficients is also used in
multiple regression studies, in which the adjusted sums of cross-products
between the dependent and independent variates replace the d's of the
present case. The further multiplication of the β-coefficients by the
$d_i$'s gives the square of the generalised distance.
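
The two steps just described, weights from the inverse dispersion matrix and then the squared distance as the sum of weight times difference, can be sketched in a few lines. The example below solves the linear system directly instead of forming the inverse, which is one of the alternative computational arrangements and not necessarily the one Rao describes. The dispersion matrix and mean differences are invented, and all names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix * beta = rhs by Gauss-Jordan elimination with partial pivoting."""
    size = len(rhs)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("dispersion matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(size):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


def generalised_distance_sq(dispersion, mean_diff):
    """Return the discriminant weights beta and D^2 = sum(beta_i * d_i)."""
    beta = solve(dispersion, mean_diff)
    return beta, sum(b * d for b, d in zip(beta, mean_diff))


# Hypothetical pooled dispersion matrix for two characters, and the
# differences between the two group means for those characters.
dispersion = [[4.0, 2.0],
              [2.0, 3.0]]
mean_diff = [2.0, 1.0]

beta, D2 = generalised_distance_sq(dispersion, mean_diff)
print(beta, D2)
```

In this invented case the second character receives zero weight: its mean difference is fully accounted for by its covariance with the first, which is the sense in which the procedure disentangles correlated characters.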

Thus for each discriminant function there is an associated distance, 
so that when the separation of two groups of organisms is made as 
large as possible, for a given set of characters in the function, these 
two groups will be separated in a definite direction in the multi-dimensional
space. We may expect that, when biologically similar contrasts
are made between groups of organisms (as for instance, between the 
sexes of several related species) these contrasts will be similarly oriented 
in the common hyperspace, even though the generalised distances 
between the sexes, i.e. the degree of sexual dimorphism, may differ
from species to species. These vector properties of discriminant func- 
tions have been so little used since Fisher [1938] described their applica- 
tion as of urgent importance twenty years ago, that Goodall [1954] is 
almost the only author to make explicit use of them, in this instance in 
phytosociological investigations. The arrangement of groups of organ- 
isms, separated by the generalised distances between each pair of 
groups, in an appropriate hyperspace has been called discriminatory 
topology, and the descriptions of the method accompany the anthropo- 
metric investigations of Mahalanobis, Majumdar and Rao [1949], of 
Rao [1952] and of Mukherjee, Rao, and Trevor [1955], all of whom used 
generalised distances mainly as indicators of group separation, as did 
Hughes and Lindley [1955], without paying much attention to the 
direction of separation. 

There is, nevertheless, a closely related method described by Rao 
[1952], by which the several groups may be depicted on a chart, so 
that the underlying relationships between their forms can be exhibited. 
Representing, as before, the pooled dispersion matrix as A, we introduce 
the matrix B which represents the dispersion of the different groups in 
the hyperspace just as A represents that of the individual organisms 
about the mean for their group. The determinantal equation


$$|A - \lambda B| = 0$$


may be solved to give as many solutions for the latent roots as there are
characters measured. Each of these roots is associated with a vector,
which generates an axis along which the position of the group may be 
plotted. Each of the vectors or canonical variates associated with 
the latent roots is orthogonal, so that, provided that only the first 
three roots account for important fractions of the disparity between 
the groups, a solid model may be made of their mutual relationship, 
and, for two important dimensions, a planar representation is possible. 
This method should not be confused with the evaluation of the latent 
roots of the characteristic equation of the dispersion matrix, 


$$|A - \lambda I| = 0$$


where I is the corresponding unit matrix with each entry in the leading 
diagonal unity and the remainder zero. The roots are then the latent 
roots of the dispersion matrix (the eigenfunctions of statistical mechanics)
and have associated with them the principal axes of the ellipsoid of 
individual points scattered about the general mean. The extraction of 
these principal components has close affinities with factor analysis 
(Burt and Banks [1947]; Teissier [1948]). 
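
The distinction between the two determinantal equations can be made concrete for two characters, where each equation is an ordinary quadratic in the latent root. The sketch below is illustrative only: it takes the first equation in the form printed above, with the within-group matrix first and the between-group matrix second, and treats the characteristic equation as the special case in which the second matrix is the unit matrix. The matrices are invented and the function name is hypothetical.

```python
import math


def latent_roots_2x2(A, B):
    """Roots of det(A - lam * B) = 0 for symmetric 2x2 matrices A and B."""
    q2 = B[0][0] * B[1][1] - B[0][1] ** 2
    q1 = -(A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2 * A[0][1] * B[0][1])
    q0 = A[0][0] * A[1][1] - A[0][1] ** 2
    # Guard against a tiny negative discriminant caused by rounding.
    disc = math.sqrt(max(0.0, q1 * q1 - 4 * q2 * q0))
    return sorted([(-q1 - disc) / (2 * q2), (-q1 + disc) / (2 * q2)])


identity = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical within-group dispersion A and between-group dispersion B.
within = [[1.0, 0.0], [0.0, 1.0]]
between_groups = [[2.0, 0.0], [0.0, 4.0]]
canonical_roots = latent_roots_2x2(within, between_groups)

# Principal components: latent roots of a single dispersion matrix.
dispersion = [[2.0, 1.0], [1.0, 2.0]]
principal_roots = latent_roots_2x2(dispersion, identity)

print(canonical_roots, principal_roots)
```

The first call compares two sources of dispersion with one another; the second looks only at the scatter of individuals, which is why the two analyses answer different questions.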


AN ILLUSTRATIVE EXAMPLE 


To illustrate the practical application of the approach outlined 
above the discriminatory topology of some Australian locusts and 
grasshoppers is presented. 

Locusts and grasshoppers are noted for their marked plasticity of 
form, colouration and behaviour. The density at which locusts are 
reared modifies their form, and that of their progeny (Albrecht [1955]). 

The limiting structural types then represent the phases of locusts
(Uvarov [1921]), which may be distinguished from other, density-independent
phase phenomena as kentromorphic phases (Key and
Day [1954]). The insects which develop from persistently crowded
populations constitute the gregaria phase; those from dispersed populations
fall into the phase solitaria. The relations between the density
and possible swarm formation are complicated, but the possibility of 
the prediction of swarming is so important that much effort has gone to 
discover ways of measuring locusts so that the cumulative influence 
of population density may be assessed. Usually, pairs of measurements 
are compounded in ratios, and this empirical usage has tended to obscure 
the intention of allometric studies to distinguish the underlying modes 
of growth of different parts of an organism. The object of this paper is 
to present the relationships which can underlie, not so much the measure- 
ments made on a homogeneous group of locusts or grasshoppers, as 
those made on different groups. The mutual relations of the form of 
different phases, species and sexes of these insects have important 
biological implications. 

The technique used here is to compute the generalised distances 
between every pair of groups. These distances may then be drawn to 
scale on a chart or as a three dimensional model, with the same limita- 
tions as noted above for canonical variates. 

In such generalised distance charts the vector properties of dis- 
criminant functions (Fisher [1938]) reveal underlying dimensions along 
which changes of external form can be assessed. The generalised 
distances between the two sexes of each species may be parallel to one 
another, but not to those generalised distances which link insects of 
the same sex but of different phases, unless phase differences are no 
more than that exaggeration of normal growth reflected in sexual dimorphism.
The method may be extended to include differences of form
between subspecies or higher taxa, provided that the measurements are 
homologous, and to include groups of insects with different ecological 
backgrounds. Such insects may reflect in their shapes the consequences 
of living in different habitats, as the ‘variation écologique’ of Pasquier 
[1938]. 
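
Whether two contrasts are parallel or at right angles on a generalised distance chart can also be computed directly, since the chart is drawn in the metric of the inverse dispersion matrix. In that metric the inner product of two mean-difference vectors u and v is u'A⁻¹v, and each vector's length is its generalised distance. The sketch below, for two characters and with invented numbers, is an illustration of that geometry and not a procedure taken from the paper.

```python
import math


def inverse_2x2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def metric_product(u, v, inv):
    """Inner product u' * inv * v in the metric of the inverse dispersion matrix."""
    return sum(u[i] * inv[i][j] * v[j] for i in range(2) for j in range(2))


def angle_between(u, v, dispersion):
    """Angle in degrees between two contrasts as they appear on the chart."""
    inv = inverse_2x2(dispersion)
    cosine = metric_product(u, v, inv) / math.sqrt(
        metric_product(u, u, inv) * metric_product(v, v, inv))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


# Hypothetical pooled dispersion and mean-difference vectors.
dispersion = [[4.0, 0.0], [0.0, 1.0]]
sex_diff_species_1 = [2.0, 1.0]
sex_diff_species_2 = [4.0, 2.0]    # same direction, larger dimorphism
phase_diff = [2.0, -1.0]

print(angle_between(sex_diff_species_1, sex_diff_species_2, dispersion))
print(angle_between(sex_diff_species_1, phase_diff, dispersion))
```

In the raw measurements the sex and phase contrasts of this example are not perpendicular; they become so only after allowing for the dispersion of the characters.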

Just as studies of allometric growth have pioneered the investigation 
of underlying relationships among measurements made on a homogeneous 
group of organisms (Huxley [1932], Teissier [1937]), so comparisons
between the shapes of different groups of organisms become possible. 
Notwithstanding the qualitative methods of D’Arcy Thompson and 
their ramifications (Le Gros Clark and Medawar [1945]) the develop- 
ment of quantitative techniques for inter-group comparisons of form 
has followed different lines of thought (Cousin [1948]: Anderson [1953]). 
Such methods often stem from the concepts of factor analysis and there 
is evidence that factor analysis and principal component analysis, 
applied to the variation of form in arthropods, yield essentially similar 
results (Teissier [1955a]). The extension of techniques using generalised 
distances to include differences of taxonomic rank is straightforward, 
leading to a three-dimensional structure if the differences between the
sexes, which are primarily a matter of size, and those between the phases 
or the species or subspecies are each qualitatively distinct reflections 
of underlying modes of growth. Such virtually orthogonal relationships 
between the representation of size, phase, and specific variation have 
been found by Albrecht and Blackith [1957] among some African 
species of locust. 

Such relations are found by inspection of the generalised distance 
charts after all the groups of locusts have been located. The emphasis 
in this investigation concerns the vector, rather than the scalar, proper- 
ties of the individual comparisons between the groups of locusts and 
grasshoppers. Not only are generalised distances appropriate for this 
purpose, but by concentrating attention on the direction rather than 
the magnitude of the separation afforded one evades the problem of 
whether the material studied exhibits the full range of morphological 
plasticity of which it is capable. 


EXPERIMENTAL MATERIAL 


This paper reports a new analysis of the measurements of some 
Australian locusts and grasshoppers given by Key [1954], who has used
methods customary in the study of locust morphology, namely the 
comparison of individual characters and their ratios. The characters
employed here are listed in Table 1. 

Two genera are represented, Chortoicetes Brunn. and Austroicetes 
Uv. of which the latter has several species, some rare, one of which is 
further divided into subspecies. 


A. arida Key (10 ♂♂, 7 ♀♀)
A. vulgaris vulgaris Sjöst. (10 ♂♂, 10 ♀♀)
A. vulgaris corallipes Sjöst. (9 ♂♂, 10 ♀♀)
A. nullarborensis Key (30 ♂♂, 36 ♀♀)
A. cruciata Sauss. (29 ♂♂, 29 ♀♀)
A. tricolor Sjöst. (6 ♂♂, 10 ♀♀)
A. pusilla Walk. (10 ♂♂, 10 ♀♀)
A. tenuicornis Key (1 ♂, 2 ♀♀)
A. frater Brancs. (10 ♂♂, 10 ♀♀)


A. nullarborensis and A. cruciata exhibit the phase polymorphism 
commonly found in locusts, that is to say their form and some features 
of their colouration are density dependent. They also exhibit non-phase 
colour polymorphism and sexual dimorphism in addition to phase 
polymorphism, but as more suitable material is available elsewhere for 
assessing the morphometric consequences of colour polymorphism this 
topic will not be pressed here. 


TABLE 1

SEPARATION (D²-VALUES) OF THE PHASES BY DISCRIMINANT FUNCTIONS COMPOUNDED
OF 1-6 CHARACTERS
(? marks an entry illegible in the source.)

Species:                    A. cruciata (Eastern race)     A. nullarborensis
Character                    Males      Females            Males      Females
Elytron length                8.25       0.07               3.72       0.50
Posterior femoral length      8.27       0.08               3.73       0.73
Head width                   11.83       0.25               4.25       1.81
Pronotal length              12.98       3.43               4.80       1.85
Pronotal width               12.98        ?                 5.40       1.92
Pronotal height              12.99       3.58               5.50       1.92


The single species of Chortoicetes, C. terminifera (Walk.) has two 
races, morphologically distinguishable, which inhabit respectively 
South-Western Australia and most of the remainder of the continent. 
These forms are called the ‘southwestern’ and ‘eastern’ races, and both 
exhibit phase polymorphism. Complete sets of 6 measurements of 
64 ♂♂ and 68 ♀♀ Chortoicetes were available for analysis and the
available numbers of Austroicetes are given in parentheses after the 
specific name, though Key had many other incomplete sets on which 
his conclusions are based. For several rare species the numbers are 
small; many will prefer to discount such tentative results, others may 
prefer to see how even rare material may be used to give tentative 
results, which process is, after all, no more than usual taxonomic practice. 
The main interest of the analysis centres on the relatively abundant 
C. terminifera, A. nullarborensis and A. cruciata. 


METHODS OF ANALYSIS 


The method used here was to prepare dispersion matrices for each 
species for which 10 or more of each sex-species-race-phase category 
has been measured. These matrices were then pooled, and the inverse 
of this pooled matrix was linked to the differences between the mean 
values of the pairs of categories as described in the introduction. From 
the generalised distances so computed the charts of Figs. 1 and 2 were 
prepared. Fig. 1 is a projection, Fig. 2 a photograph, of the models 
made by cutting glass rods proportional in length to the generalised 
distances, and fitting these rods into rubber balls representing the 
groups of insects. 
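
The pooling step can be sketched as follows. The paper does not say how the matrices were combined, so the example assumes the usual convention: sums of squares and cross-products about each group mean are added together and divided by the total within-group degrees of freedom. The data and all names are invented for illustration.

```python
def sscp(rows):
    """Sums of squares and cross-products of the rows about their own mean."""
    k = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(k)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows)
             for j in range(k)] for i in range(k)]


def pooled_dispersion(groups):
    """Pool within-group SSCP matrices and divide by the total degrees of freedom."""
    k = len(groups[0][0])
    total = [[0.0] * k for _ in range(k)]
    degrees_of_freedom = 0
    for rows in groups:
        matrix = sscp(rows)
        for i in range(k):
            for j in range(k):
                total[i][j] += matrix[i][j]
        degrees_of_freedom += len(rows) - 1
    return [[value / degrees_of_freedom for value in row] for row in total]


# Two hypothetical groups, each row holding two measurements on one insect.
group_1 = [(1.0, 2.0), (3.0, 4.0)]
group_2 = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]

pooled = pooled_dispersion([group_1, group_2])
print(pooled)
```

Because each group is centred on its own mean before pooling, differences between the group means do not inflate the pooled matrix; they enter the analysis only through the mean differences used to form the generalised distances.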

The fact that the models form a three-dimensional framework, 
whereas the hyperspace is six-dimensional, may be taken to indicate 
that not more than three underlying dimensions of variation are impor- 
tant. The discrepancies in the positions of the balls located by more 
than one set of generalised distances were small enough to be considered 
as sampling variation, and accommodated by adjusting the position 
within the balls at which each rod terminated. 


RESULTS 


We have to identify the three important dimensions of variation 
within the charts. The general ‘sex’ dimension is well established and 
even the rare species for which few individuals are available conform 
to it. This reflection of sexual dimorphism is essentially a distinction 
of size. The segregation of the sexes along this dimension is well brought 
out in the figures, the females being uniformly larger than the males in 
the species studied. 

In these charts, there is a marked three-dimensionality, and we
have to identify, if possible, the remaining two sources of variation
other than size. Density-dependent changes of form, as between the
phases of locusts, are known to constitute one such dimension of variation
(Albrecht [1955]). Within the genus Austroicetes the planes
defined by the generalised distances between the phases and sexes of
A. nullarborensis and A. cruciata (Eastern race) are nearly parallel.
Moreover, the generalised distances which link corresponding groups
of the two species are roughly at right angles to those planes linking
the phases and the sexes, showing that the interspecific difference of
form is not an exaggeration of sexual dimorphism or phase variation
to be found in the common ancestor.


FIGURES 1a AND 1b.

GENERALISED DISTANCE CHART FOR Chortoicetes terminifera, BASED ON SIX
CHARACTERS.

a) END VIEW OF CHART, LOOKING ALONG ‘SIZE’ AXIS.
b) SIDE VIEW OF CHART, LOOKING ALONG ‘PHASE’ AXIS.

SOUTH WESTERN RACE.
SW   C. terminifera

EASTERN RACE.
SS₁  C. terminifera from cooler, moister areas where swarms never occur.
SS₂  Solitary insects of Eastern race, probably of solitary parentage, but from
     warmer, drier area. (SS₁ and SS₂ are of ph. solitaria).
SG   Solitary insects, probably from gregarious parents.
GS   Gregarious insects, from solitary parents.
GG   Gregarious insects, from gregarious parents. (ph. gregaria).

The chart for the genus Chortoicetes (Fig. 1a & b) reveals that the
‘phase’ dimension of the chart is delimited by groups of locusts which 
are, respectively, reared in comparative isolation and bred from dispersed 
parents, or reared in crowds and bred from crowded parents. There 
are also two intermediate categories: locusts from crowded parents
but themselves reared in comparative isolation, and locusts whose
parents were dispersed but which, themselves, have been reared
crowded. These intermediate forms lie between the two extremes
(Fig. 1a). The inference is that the full morphometric expression of
phase status is not attained unless the parental density accords with
that of progeny, and both are at extremes of the normal biological
range. On the evidence of this investigation, the influence of parental
isolation, working in the opposite sense to that of the crowding of the
immature stages, reduces the phase status to a position about half-way
between the extremes as represented in Fig. 1a. Quantitatively, this
conclusion is tentative until the cumulative densities of parent and 
progeny populations can be established in the field with more accuracy 
than has been possible so far. This finding agrees with those of Albrecht 
[1955] (and private communication) who found that the parental density 
of Red and Migratory Locusts modifies the form of progeny so that 
unless both the parents and the progeny are crowded in immature life 
the latter will not attain full gregaria phase status. 

The charts help to explain anomalies arising from the use of ratios 
of characters. For instance, Key [1954] points out that the South- 
Western race of Chortoicetes terminifera, assessed in terms of the ratio 
elytron length/posterior femoral length, appears to be ‘super-solitaria’,
in the sense that this ratio takes values for that race which are normally
associated with highly dispersed populations of Chortoicetes even though
the populations were in fact far from sparse. On the other hand, other 
ratios, of head and pronotal characters, gave contradictory results, 
and Key indicates that the differences between the races must be distinct 
from those which exist between phases. Fig. 1 (a & b) shows that this
South-Western race is in fact displaced from the Eastern race along a 
dimension qualitatively distinct from the ‘phase’ or ‘sex’ dimensions. 
Some interesting features emerge from attempts to identify this third 
dimension. 

Considering only the Eastern race of C. terminifera, there are
represented on the chart two samples of approximately the same phase 
status but from localities providing different habitats for the insects. 
The groups labelled SS₁ come from regions at once moister and cooler
than the regions from which come samples SS₂. Fig. 1b supports
Key’s conclusion that sample SS, differs from the others perhaps 
partly through the influence of climate superimposed on differences in 
gregarization. We find that our third dimension encompasses alike
differences of form between races and the lesser differences between
insects from distinct habitats. Moreover, in Fig. 2 it is that dimension
normal to the ‘phase’ and ‘sex’ dimensions in which most of the specific
differentiation within the genus Austroicetes lies, just as in Fig. 1 the
subspecific and ecological variation occupies this third dimension normal
to those of phase and of sexual dimorphism. As the same six characters
are measured for each genus, these share a common hyperspace of six
dimensions, the two most important underlying dimensions being
identifiable in common. This identification may be carried further by
linking the figures for Austroicetes and for Chortoicetes. In fact, the
Chortoicetes lie underneath the Austroicetes with the males and females
of each chart aligned. The differences between the genera are, in effect,
exaggerations of those changes of form which accompany speciation
and subspeciation in Austroicetes and ecological variation in Chortoicetes.
This comprehensive dimension completes the identification of the three
major manifestations of morphological plasticity of the species studied.


FIGURE 2

GENERALISED DISTANCE CHART FOR Austroicetes SPP. BASED ON SIX CHARACTERS.

A.   Austroicetes arida
F.   A. frater
L.   A. cruciata solitarioid (Eastern race)
Z.   A. cruciata gregarioid (Eastern race)
Y.   A. cruciata gregarioid (Western race)
X.   A. cruciata solitarioid (Western race)
P.   A. pusilla
V.   A. vulgaris corallipes
W.   A. vulgaris vulgaris
T.   A. tricolor
TE.  A. tenuicornis
     A. nullarborensis gregarioid
N.   A. nullarborensis solitarioid

Among the Austroicetes, A. arida is rather oddly aligned, and the 
solitaria phase of A. cruciata (Western race) has the generalised distance 
between the sexes aligned with the ‘species’ dimension. One might 
suggest that these insects were hybrids, as it is known that in gryllids, 
at least, the inheritance of form is liable to show marked sex-linkage 
(Cousin [1948]), were it not that the cytological and other evidence 
discounts this suggestion (White and Key [1957]). 


THE EFFICACY OF DIFFERENT MEASUREMENTS AS INDICATORS 
OF LOCUST PHASE STATUS 


Much research has been directed to the discovery of characters 
which reveal the phase status of locusts. Anderson [1953] has emphasised 
the importance of a careful choice of morphometric characters in botan- 
ical work. One depends on the information provided by a discriminatory 
analysis to verify the wisdom of this choice. 

Key’s choice of characters has been assessed by computing the 
successive D²-values which differentiate the phases of the Australian
species he studied. Though there is some doubt as to whether the full
expression of phase status is to be found in every instance, the total
observed D² contributed by each combination of characters is given in
Table 1 for two of the species. Among the conventional characters, 
the head-width and pronotal length seem to be most useful with this 
species. In exploratory investigations one needs a battery of such 
characters in the hope that, between them, the several underlying modes 
of growth will be sufficiently illuminated. 
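
Since Table 1 gives cumulative D²-values, the contribution of each character is the increment it adds when it joins the function. The sketch below works this out for the column of males of A. cruciata (Eastern race), whose entries are all legible; the names are invented for the example. An increment depends on which characters have already been entered, so it measures what a character adds to its predecessors, not its value in isolation.

```python
characters = [
    "Elytron length",
    "Posterior femoral length",
    "Head width",
    "Pronotal length",
    "Pronotal width",
    "Pronotal height",
]

# Cumulative D^2 for males of A. cruciata (Eastern race), from Table 1.
cumulative_d2 = [8.25, 8.27, 11.83, 12.98, 12.98, 12.99]

# Increment contributed by each character in its order of entry.
increments = []
previous = 0.0
for value in cumulative_d2:
    increments.append(round(value - previous, 2))
    previous = value

for name, gain in zip(characters, increments):
    print(f"{name:26s} {gain:5.2f}")
```

For this column, after the first character has been entered, head width and pronotal length add the most, in line with the remark in the text.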


DISCUSSION 


There are two broad purposes served by the construction of general- 
ised distance charts. That emphasised by Rao [1952] throws into 
relief the proximity of groups represented on the chart; those separated 
by a small distance are more alike morphometrically than are pairs 
with a high value between them. An extension follows from the recog- 
nition that discriminant functions are vectors which can differ in both 
magnitude (D²-value) and direction as first noted by Fisher [1938].
Locusts and grasshoppers have a well-marked sexual dimorphism
which is represented on a generalised distance chart by a series of
nearly parallel vectors joining the sexes. By including such size-sensitive
characters as body-weight at eclosion in the discriminant
function and noting the increment of the D²-statistic produced by each
character relative to that produced by including the weight, the size-sensitivity
of any character can be tested. Mutatis mutandis, the
efficacy of any character as a discriminator of phases or species may be 
examined. The generalised distance charts for all the swarming species 
so far investigated are two-dimensional, confirming the qualitative 
distinction between modes of normal and of density-dependent growth. 

Further comparisons between the species, wherever these have 
been made, show that the specific differences of form are not, in these 
instances, an exaggeration of the density-dependent or normal modes 
of growth. This kind of evidence, albeit negative in character, is of 
taxonomic interest, and is available at those lower taxonomic levels 
for which the conventional classificatory processes are often least 
securely based on a phylogenetic foundation. One could use such 
methods to help elucidate such current problems as the gradual, or, 
alternatively, the saltatory, nature of speciation. 

To show that putative species differ little along the ‘species dimen- 
sion’, is not to demonstrate their identity. One could do as much for 
many pairs of well-defined species. The present illustrations are 
included to show one kind of relevant information which may be ex- 
tracted from multivariate analyses of form in locusts. Key [1954] 
figures a tentative phylogenetic tree of the locusts he studied, based on 
characters of taxonomic value, on ecology, and on distribution. The 
generalised distance charts may be regarded as a morphometric reflection 
of such a tree. The correspondence is encouraging. The genera are 
distinctly separated; tenuicornis is on both representations the Austroicetes 
closest in affinity to Chortoicetes; and pusilla, cruciata (Eastern 
race) and nullarborensis are the Austroicetes least like Chortoicetes. 
The most serious discrepancy is that for the Western race of A. cruciata. 
Besides being internally inconsistent, in that the solitaria show a 
sexual dimorphism along the ‘species’ dimension, all four groups of the 
Western race are far removed from those of the Eastern race, for which 
feature no easy explanation seems available. The general success of 
the technique adopted in disclosing biologically identifiable dimensions 
of variation, despite the small size of many samples, may yet help to 
illustrate a use for discriminatory analysis complementary to the current 
preoccupation with classificatory problems (Williams [1955]). 

I should like to thank Dr. K. H. L. Key for reading this note in 
manuscript and for many helpful suggestions, and Miss Margaret 
Roberts for most of the computations involved. I owe the photographs 
and the projection to my colleague Mr. J. W. Siddorn. 


EXAMPLES OF INTRA-BLOCK ANALYSIS FOR FACTORIALS 
IN GROUP DIVISIBLE, PARTIALLY BALANCED, 
INCOMPLETE BLOCK DESIGNS*,** 


CLYDE YOUNG KRAMER AND RALPH ALLAN BRADLEY 


Virginia Agricultural Experiment Station of the Virginia Polytechnic Institute, 
Blacksburg, Virginia, U.S.A. 


1. INTRODUCTION 


Incomplete block designs were developed to accommodate experi- 
ments wherein it is necessary that the number of experimental units 
per block be less than the number of treatments. These designs are 
used extensively in many fields of research. Factorial treatment com- 
binations were introduced so that the effects of several variables, together 
with their interactions, may be studied in a single experiment. Factorial 
treatment combinations are being used in many fields of research. 
Kramer [1] and Kramer and Bradley [2] have presented the theory 
allowing one to combine these two important concepts, incomplete 
blocks and factorials, to increase the utility of both. 

Partially balanced incomplete block designs were first discussed 
by Bose and Nair [8] and these designs have proved to be very useful. 
Bose and Shimamoto [9] later classified partially balanced incomplete 
block designs and, since then, Bose, Clatworthy, and Shrikhande [3] 
have provided a catalogue of such designs with two associate classes. 
Designs are given in the catalogue for block sizes 3 ≤ k ≤ 10 and 
replications r ≤ 10 and when $E_1$ and $E_2$, efficiency factors that will be 
further defined later, are not too different. Group divisible designs 
form a large and important class of partially balanced, incomplete 
block designs with two associate classes. In the catalogue, the param- 
eters for each design, the association matrices, and block layouts are 
given. Clatworthy [4] has considered designs with k = 2 and r ≤ 10 

Marketing Act Contract, No. 12-14-100-126(20). 
**A condensation of part of a dissertation by C. Y. Kramer submitted to the Virginia Polytechnic 
Institute in partial fulfillment of the requirements for the Ph.D. degree in Statistics. 


198 BIOMETRICS, JUNE 1957 


and indicates how such designs may be easily written down. It is the 
objective of this paper to present examples of analyses for factorial 
treatments in group divisible, incomplete block designs and to summarize 
the basic general results obtained. 

Factorials in incomplete block designs, except as confounding and 
partial confounding have been developed, seem not to have been much 
used. Methods have been available for the use of factorials in balanced 
incomplete block designs since 1938, the date of publication by Cornish 
[10]. In addition, Harshbarger [7] considered a 2³ factorial in a Latinized, 
rectangular lattice. In current statistical practice, there appears to be 
an increasing need for factorials in incomplete block designs with 
applications in both industry and agriculture. 

When this work was presented at a conference,* R. C. Bose noted 
that the designs in [3] were essentially developed for varietal trials 
and that this was why $E_1$ and $E_2$ were not too different. He pointed out 
that, for factorial treatment combinations, designs with widely different 
values of $E_1$ and $E_2$ become much more important, for it may be that 
much is already known about some factors and their interactions while 
others need to be more fully investigated. Then the factorial treat- 
ments may be assigned to the association matrix in accordance with 
these needs. Certain group divisible designs are disconnected and 
have been discarded and not catalogued on the grounds that they are 
not useful designs for varietal trials. These designs may be very useful 
for factorials, for then the factorial associations of the treatment com- 
binations act as links between blocks. 

In developing the theory for factorials, we have presented the 
analysis of variance with the treatment sum of squares as a function 
of the least squares estimators of the treatment effects. The sums of 
squares for factorial effects are also obtained simply in terms of these 
treatment estimators. We have thus somewhat reduced the number of 
parameters of the designs that are required and explicitly shown in 
the analyses. It will be seen that the analysis of variance, both for 
unrelated treatments and for factorial treatments, in group divisible 
designs is not more difficult than for simpler incomplete block designs 
that are more frequently used. 

The results in this paper are more general than those obtained by 
Cornish. Balanced incomplete block designs may be regarded as a 
special class of group divisible designs, and we have noted the special 
forms obtained for balanced designs as developed by Cornish and 
included here for the convenience of the reader. 


*Industrial Experimental Design Conference sponsored by the Air Force Office of Scientific 
Research, N. C. State College, November 5-9, 1956. 


2. GROUP DIVISIBLE, PARTIALLY BALANCED, INCOMPLETE 
BLOCK DESIGNS 

The properties of group divisible designs given in [3] are repeated 
here for convenience. These properties are: 


(i) The design has b blocks of k experimental units with different 
treatments (or treatment combinations) applied to the units in the 
same block. 

(ii) There are v = mn treatments (v > k) assignable to m groups 
of n each in the association scheme (1), where we use a double subscript 
notation instead of the more usual single subscripts, such that treat- 
ments in the same group or row are first associates and two treatments 
not in the same row of (1) are second associates. 

$$\begin{matrix} V_{11} & V_{12} & \cdots & V_{1n} \\ \vdots & \vdots & & \vdots \\ V_{m1} & V_{m2} & \cdots & V_{mn} \end{matrix} \qquad (1)$$

(iii) Each treatment has (n − 1) first associates and n(m − 1) 
second associates. 

(iv) Two treatments that are i-th associates appear together in 
exactly $\lambda_i$ blocks, i = 1, 2. 

(v) Given any two treatments that are i-th associates, the number 
of treatments common to the j-th associates of the first and the k-th 
associates of the second is $p^{i}_{jk}$, and is independent of the pair of treat- 
ments selected. If $P_i$ is the matrix with elements $p^{i}_{jk}$, 

$$P_1 = \begin{pmatrix} n-2 & 0 \\ 0 & n(m-1) \end{pmatrix}, \qquad P_2 = \begin{pmatrix} 0 & n-1 \\ n-1 & n(m-2) \end{pmatrix}.$$

(vi) The inequalities, $r \ge \lambda_1$, $rk - \lambda_2 v \ge 0$, hold. 

(vii) The design parameters are related so that $(n-1)\lambda_1 + n(m-1)\lambda_2 = r(k-1)$ 
or $rk - \lambda_2 v = r - \lambda_1 + n(\lambda_1 - \lambda_2)$. 


Subclasses of these designs are: 


Singular (S) if $r = \lambda_1$, 
Semi-regular (SR) if $r > \lambda_1$ and $rk - \lambda_2 v = 0$, and 
Regular (R) if $r > \lambda_1$ and $rk - \lambda_2 v > 0$. 
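
The parameter relations and the subclass rule above can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function name `classify_group_divisible` is hypothetical, and the check b·k = r·v is a standard counting identity that the list of properties does not state explicitly.

```python
def classify_group_divisible(v, b, r, k, m, n, lam1, lam2):
    """Check properties (ii), (vi), (vii) and return the subclass name."""
    if v != m * n:
        raise ValueError("property (ii) fails: v must equal m * n")
    if b * k != r * v:
        # Standard counting identity (an added check, not listed in the
        # text): both sides count the experimental units.
        raise ValueError("b * k must equal r * v")
    if r < lam1 or r * k - lam2 * v < 0:
        raise ValueError("inequalities (vi) fail")
    if (n - 1) * lam1 + n * (m - 1) * lam2 != r * (k - 1):
        raise ValueError("relation (vii) fails")
    if r == lam1:
        return "singular"
    if r * k - lam2 * v == 0:
        return "semi-regular"
    return "regular"


# Parameters quoted later in the paper for designs S6 and SR21 of [3].
print(classify_group_divisible(v=8, b=6, r=3, k=4, m=4, n=2, lam1=3, lam2=1))
print(classify_group_divisible(v=12, b=16, r=4, k=3, m=3, n=4, lam1=0, lam2=1))
```

With the parameters of the two worked examples, the first call reports a singular design and the second a semi-regular one, in line with the catalogue labels S6 and SR21.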


We shall consider the class as a whole without subdivision. But first 
let us note that certain designs, familiar to the reader and described 
by Harshbarger [5, 6], are a subclass of the Semi-regular designs.* 


*The grouping of blocks to form near balance, rectangular lattices or Latinized, rectangular 
lattices only requires subdivision of the unadjusted block sum of squares (as in Table 1). 

These are near balance, rectangular lattices and Latinized, rectangular 
lattices, and they may be defined in terms of integers K and Q asso- 
ciated with the parameters of the semi-regular designs as follows: 


$$v = K(K-Q),\quad k = K-Q,\quad r = K,\quad b = K^2,\quad m = K-Q,\quad n = K,\quad \lambda_1 = 0,\quad \text{and}\quad \lambda_2 = 1.$$


When $\lambda_1 = \lambda_2$, the incomplete block design is balanced. The results 
of this paper apply to balanced incomplete block designs and simplify 
through use of the stated equality. Then treatments may be assigned 
to an association matrix like (1) in any desired way and with free 
choice of values of m and n such that v = mn. 
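
As an illustration of the K, Q parametrization of near balance designs, the sketch below builds the parameter set and confirms that it satisfies relation (vii) and the semi-regular condition. The helper name `near_balance_parameters` is hypothetical, and the requirement that K − Q be at least 2 is an assumption made here so that blocks hold more than one unit.

```python
def near_balance_parameters(K, Q):
    """Design parameters of a near balance design in terms of K and Q."""
    if K - Q < 2:
        # Assumption of this sketch: a block needs at least two units.
        raise ValueError("K - Q must be at least 2")
    return {
        "v": K * (K - Q), "k": K - Q, "r": K, "b": K * K,
        "m": K - Q, "n": K, "lam1": 0, "lam2": 1,
    }


p = near_balance_parameters(K=4, Q=1)
# Relation (vii): (n - 1) * lam1 + n * (m - 1) * lam2 = r * (k - 1).
lhs = (p["n"] - 1) * p["lam1"] + p["n"] * (p["m"] - 1) * p["lam2"]
print(p, lhs == p["r"] * (p["k"] - 1), p["r"] * p["k"] == p["lam2"] * p["v"])
```

For K = 4 and Q = 1 this gives the parameters of the second example of the paper (v = 12, k = 3, r = 4, b = 16, m = 3, n = 4).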

The model assumed for all randomized incomplete block designs is 


$$y_{ijs} = \mu + \tau_{ij} + \beta_s + \epsilon_{ijs}, \qquad (2)$$


where $y_{ijs}$ is the observation on $V_{ij}$ in block s if that treatment occurs 
in block s, $\mu$ is the grand mean, $\tau_{ij}$ is the effect of $V_{ij}$, $\beta_s$ is the effect 
of block s, and $\epsilon_{ijs}$ is the usual normal random error with mean zero 
and variance $\sigma^2$, the various $\epsilon$'s being independent. Restrictions on 
the parameters in (2) are that $\sum_{i=1}^{m}\sum_{j=1}^{n}\tau_{ij} = 0$ and $\sum_{s=1}^{b}\beta_s = 0$. 
The analysis of variance is given algebraically (with the exception of 
mean squares which are obtained in the usual way from sums of squares) 
in Table 1. The meaning of Table 1 is clear with the following definitions: 


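TABLE 1 
ANALYSIS OF VARIANCE FOR THE GENERAL MODEL 

Source               D.f.                Sum of squares                                                                                                                                       Ref. no. 
Treatments (adj.)    v − 1               $\frac{\lambda_1 + rk - r}{k}\sum_{i=1}^{m}\sum_{j=1}^{n} t_{ij}^2 - \frac{\lambda_1 - \lambda_2}{k}\sum_{i=1}^{m}\Bigl(\sum_{j=1}^{n} t_{ij}\Bigr)^2$   (3) 
Blocks (unadj.)      b − 1               $\frac{1}{k}\sum_{s=1}^{b} B_s^2 - \frac{G^2}{rv}$                                                                                                   (4) 
Intra-block error    v(r − 1) − b + 1    By subtraction 
Total                rv − 1              $\sum_{i}\sum_{j}\sum_{s} y_{ijs}^2 - \frac{G^2}{rv}$                                                                                                (5) 


$$G = \sum_{i}\sum_{j}\sum_{s} y_{ijs}, \qquad (6)$$

$$B_s = \sum_{i}\sum_{j} y_{ijs}, \qquad (7)$$

$$t_{ij} = \Bigl[k\lambda_2 v\,T_{ij} - k(\lambda_2 - \lambda_1)\sum_{h=1}^{n} T_{ih} - \lambda_2 v\,B_{ij.} + (\lambda_2 - \lambda_1)\sum_{h=1}^{n} B_{ih.}\Bigr] \Big/ v\lambda_2(\lambda_1 + rk - r) \qquad (8)$$

and $t_{ij}$ is the least squares estimator* of $\tau_{ij}$ in (2), 

$$T_{ij} = \sum_{s \text{ with } (i,j)} y_{ijs}, \qquad (9)$$

and 

$$B_{ij.} = \sum_{s \text{ with } (i,j)} B_s. \qquad (10)$$
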
*For a near balance or Latinized design, 

$$t_{ij} = \Bigl[K(K-Q)^2\,T_{ij} - (K-Q)\sum_{h=1}^{K} T_{ih} - K(K-Q)\,B_{ij.} + \sum_{h=1}^{K} B_{ih.}\Bigr] \Big/ K^2(K-Q)(K-Q-1)$$

$$\phantom{t_{ij}} = \frac{K(K-Q)-1}{K^2(K-Q-1)}\,T_{ij} - \frac{1}{K^2(K-Q-1)}\sum_{h \ne j} T_{ih} - \frac{B_{ij.}}{K(K-Q-1)} + \frac{G}{K^2(K-Q)(K-Q-1)},$$

the latter result being in the usual textbook form. Now (3) becomes 

$$\frac{K(K-Q-1)}{K-Q}\sum_{i}\sum_{j} t_{ij}^2 + \frac{1}{K-Q}\sum_{i}\Bigl(\sum_{j} t_{ij}\Bigr)^2.$$

For a balanced design, with $\lambda$ the common value of $\lambda_1$ and $\lambda_2$, 

$$t_{ij} = (kT_{ij} - B_{ij.})/(\lambda + rk - r) = (kT_{ij} - B_{ij.})/v\lambda$$

and (3) becomes 

$$\frac{\lambda v}{k}\sum_{i}\sum_{j} t_{ij}^2,$$

since then $\lambda + rk - r = \lambda v$. 

More simply, G is the grand total, $B_s$ is the block total for block s, 
$T_{ij}$ is the treatment total for $V_{ij}$, and $B_{ij.}$ is the total of block totals 
for blocks containing $V_{ij}$. Convenient computing organization for 
obtaining the entries in Table 1 is indicated in the numerical examples. 

Certain variances and covariances are useful. In the following 
results, $s^2$, the error mean square obtainable from Table 1, is used 
as the estimator of $\sigma^2$. We note that 


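$$V(t_{ij}) = k\sigma^2\Bigl[\frac{m-1}{mn\lambda_2 v} + \frac{n-1}{n(\lambda_1 + rk - r)}\Bigr], \qquad (11)$$

$$\mathrm{Cov}(t_{ij}, t_{ij'}) = k\sigma^2\Bigl[\frac{m-1}{mn\lambda_2 v} - \frac{1}{n(\lambda_1 + rk - r)}\Bigr], \quad j \ne j', \qquad (12)$$

and 

$$\mathrm{Cov}(t_{ij}, t_{i'j'}) = -k\sigma^2/mn\lambda_2 v, \quad i \ne i'. \qquad (13)$$


$V_{ij}$ and $V_{i'j'}$ are first-associate treatments if and only if $i = i'$. Then, 

$$V(t_{ij} - t_{ij'}) = 2k\sigma^2/(\lambda_1 + rk - r), \quad j \ne j', \qquad (14)$$

and 

$$V(t_{ij} - t_{i'j'}) = 2k\sigma^2(\lambda_2 v + \lambda_1 - \lambda_2)/v\lambda_2(\lambda_1 + rk - r), \quad i \ne i'. \qquad (15)$$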
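
The variances of differences in (14) and (15) should follow from the variance and covariances in (11), (12), and (13), since V(x − y) = V(x) + V(y) − 2 Cov(x, y). The sketch below checks this with exact rational arithmetic for the parameters of the two designs used later in the paper. It assumes the forms of (11) to (15) as read here from a damaged scan, and the function names are hypothetical.

```python
from fractions import Fraction as F


def t_covariances(v, r, k, m, n, lam1, lam2):
    """Coefficients of sigma^2 in (11), (12), and (13)."""
    c = lam1 + r * k - r
    var = k * (F(m - 1, m * n * lam2 * v) + F(n - 1, n * c))     # (11)
    cov_first = k * (F(m - 1, m * n * lam2 * v) - F(1, n * c))   # (12)
    cov_second = -F(k, m * n * lam2 * v)                         # (13)
    return var, cov_first, cov_second


def difference_variances(v, r, k, lam1, lam2):
    """Coefficients of sigma^2 in (14) and (15)."""
    c = lam1 + r * k - r
    return F(2 * k, c), F(2 * k * (lam2 * v + lam1 - lam2), v * lam2 * c)


DESIGNS = {
    "S6": dict(v=8, r=3, k=4, m=4, n=2, lam1=3, lam2=1),
    "SR21": dict(v=12, r=4, k=3, m=3, n=4, lam1=0, lam2=1),
}
for name, p in DESIGNS.items():
    var, cov1, cov2 = t_covariances(**p)
    v14, v15 = difference_variances(p["v"], p["r"], p["k"], p["lam1"], p["lam2"])
    print(name, 2 * (var - cov1) == v14, 2 * (var - cov2) == v15)
```

Both comparisons hold exactly for each design, which also serves as a check on the reading of the garbled formulas.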


The efficiencies $E_1$ and $E_2$, noted in the Introduction, depend on (14) 
and (15). These efficiencies are obtained by taking the ratio of the 
variance of the treatment contrast for a randomized block design to 
the corresponding variance for the incomplete block design, given 
equal values of r and on the assumption that both designs yield the 
same experimental error (that is, on the assumption that the use of 
the smaller blocks was not effective). The efficiency for the comparison 
of two treatments that are first associates is 


$$E_1 = (\lambda_1 + rk - r)/rk \qquad (16)$$


and that are second associates is 


$$E_2 = v\lambda_2(\lambda_1 + rk - r)/rk(\lambda_1 + \lambda_2 v - \lambda_2). \qquad (17)$$
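
A short numerical illustration of (16) and (17) may help. The sketch below evaluates both efficiencies exactly for the two designs analysed later in the paper; the function name `pair_efficiencies` is hypothetical.

```python
from fractions import Fraction


def pair_efficiencies(v, r, k, lam1, lam2):
    """Return (E1, E2) of (16) and (17) as exact fractions."""
    c = lam1 + r * k - r
    e1 = Fraction(c, r * k)
    e2 = Fraction(v * lam2 * c, r * k * (lam1 + lam2 * v - lam2))
    return e1, e2


print(pair_efficiencies(v=8, r=3, k=4, lam1=3, lam2=1))    # design S6
print(pair_efficiencies(v=12, r=4, k=3, lam1=0, lam2=1))   # design SR21
```

For S6 the first-associate comparison is fully efficient while second associates are compared with efficiency 4/5; for SR21 the two efficiencies are 2/3 and 8/11.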


The theory for the basic analysis of Table 1, and for the analyses 
for factorials summarized in the next section, may be developed by 
use of the method of least squares. To indicate the procedure, we note 
that to obtain the adjusted treatment sum of squares in Table 1, it is 
sufficient to evaluate the difference between the minima of the sums of 
squares, 


$$\sum_{i}\sum_{j}\sum_{s} (y_{ijs} - \mu - \tau_{ij} - \beta_s)^2$$


and 


$$\sum_{i}\sum_{j}\sum_{s} (y_{ijs} - \mu - \beta_s)^2,$$


the former minimized subject to the restraints on the $\tau_{ij}$ and the $\beta_s$ 
and the latter subject to the restraint on the $\beta_s$. Similar differences 
are required for sums of squares assignable to factorial effects and detail 
on the theory is given in [2]. 


3. FACTORIALS IN GROUP DIVISIBLE DESIGNS 


We first consider what we regard as the basic two-factor factorial 
in group divisible, partially balanced, incomplete block designs. The 
analyses for multi-factor factorials follow from this. 

Consider factors A and C with m and n levels respectively. We 
amend the model (2) to obtain 


$$y_{ijs} = \mu + \alpha_i + \gamma_j + \delta_{ij} + \beta_s + \epsilon_{ijs} \qquad (18)$$


with the restrictions, 

$$\sum_{i=1}^{m}\alpha_i = 0, \quad \sum_{j=1}^{n}\gamma_j = 0, \quad \sum_{i=1}^{m}\delta_{ij} = 0, \quad \sum_{j=1}^{n}\delta_{ij} = 0, \quad\text{and}\quad \sum_{s=1}^{b}\beta_s = 0.$$


$\alpha_i$ is the effect of the i-th level of the A-factor, $\gamma_j$ is the effect of the 
j-th level of the C-factor, and $\delta_{ij}$ represents the interaction of the i-th 
level of A with the j-th level of C. $\epsilon_{ijs}$ is as previously defined. The 
new model (18) is obtained by setting $\tau_{ij}$ in (2) equal to $(\alpha_i + \gamma_j + \delta_{ij})$. 
The treatment $V_{ij}$ has now become the factorial treatment combination 
$A_iC_j$ and such substitution may be made in the association matrix (1) 
if desired. 

The least squares estimators for the factorial effects in (18) may be 
expressed in terms of the original treatment estimators $t_{ij}$ in (8). We 
now have, using latin letters for the corresponding parameters, 


$$a_i = \frac{1}{n}\sum_{j=1}^{n} t_{ij} = \bar t_{i.}, \qquad (19)$$


$$c_j = \frac{1}{m}\sum_{i=1}^{m} t_{ij} = \bar t_{.j}, \qquad (20)$$

and 

$$d_{ij} = t_{ij} - \bar t_{i.} - \bar t_{.j}. \qquad (21)$$

The appropriate analysis of variance is given in Table 2. In Table 2, 
the sums of squares (22), (23), and (24) add to the treatment sum 
of squares (3) in Table 1, and consequently it is easier to obtain the 
adjusted AC-interaction sum of squares by subtracting the total of 
(22) and (23) from (3).* Sums of squares for factorial effects in Table 


TABLE 2 


ANALYSIS OF VARIANCE FOR THE TWO-FACTOR FACTORIAL 


Source                   D.f.                Sum of squares                                                                                                Ref. no. 
A-factor (adj.)          m − 1               $\frac{n\lambda_2 v}{k}\sum_{i=1}^{m}\bar t_{i.}^{\,2}$                                                      (22) 
C-factor (adj.)          n − 1               $\frac{m(\lambda_1 + rk - r)}{k}\sum_{j=1}^{n}\bar t_{.j}^{\,2}$                                              (23) 
AC-interaction (adj.)    (m − 1)(n − 1)      $\frac{\lambda_1 + rk - r}{k}\sum_{i=1}^{m}\sum_{j=1}^{n}(t_{ij} - \bar t_{i.} - \bar t_{.j})^2$             (24) 
Blocks (unadj.)          b − 1               $\frac{1}{k}\sum_{s=1}^{b} B_s^2 - \frac{G^2}{rv}$ 
Intra-block error        v(r − 1) − b + 1    By subtraction 
Total                    rv − 1              $\sum_{i}\sum_{j}\sum_{s} y_{ijs}^2 - \frac{G^2}{rv}$ 


*For a near balance or Latinized design, (22), (23), and (24) become respectively 

$$K^2\sum_{i=1}^{K-Q}\bar t_{i.}^{\,2}, \qquad K(K-Q-1)\sum_{j=1}^{K}\bar t_{.j}^{\,2}, \qquad\text{and}\qquad \frac{K(K-Q-1)}{K-Q}\sum_{i=1}^{K-Q}\sum_{j=1}^{K}(t_{ij} - \bar t_{i.} - \bar t_{.j})^2.$$

For a balanced design, they become 

$$\frac{n\lambda v}{k}\sum_{i=1}^{m}\bar t_{i.}^{\,2}, \qquad \frac{m\lambda v}{k}\sum_{j=1}^{n}\bar t_{.j}^{\,2}, \qquad\text{and}\qquad \frac{\lambda v}{k}\sum_{i=1}^{m}\sum_{j=1}^{n}(t_{ij} - \bar t_{i.} - \bar t_{.j})^2.$$

In a group divisible, partially balanced design, m and n are specified by the design; in a balanced 
design, m and n may be chosen in any way so long as v = mn. 

2 are independent. (22), (23), and (24) respectively are used for tests 
of the null hypotheses, 


$$H_0(A):\ \alpha_1 = \cdots = \alpha_m,$$

$$H_0(C):\ \gamma_1 = \cdots = \gamma_n, \quad\text{and}$$

$$H_0(AC):\ \delta_{11} = \cdots = \delta_{mn},$$


against general alternatives in each case. 
The following variances are estimated by substituting the error 
mean square obtainable from Table 2 for $\sigma^2$: 


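$$V(a_i - a_{i'}) = \frac{2k\sigma^2}{n\lambda_2 v}, \quad i \ne i', \qquad (25)$$

$$V(c_j - c_{j'}) = \frac{2k\sigma^2}{m(\lambda_1 + rk - r)}, \quad j \ne j', \qquad (26)$$

and 

$$V(d_{ij}) = \frac{k(m-1)(n-1)\sigma^2}{mn(\lambda_1 + rk - r)},$$

$$\mathrm{Cov}(d_{ij}, d_{i'j}) = \frac{-k(n-1)\sigma^2}{mn(\lambda_1 + rk - r)}, \quad i \ne i', \qquad (27)$$

$$\mathrm{Cov}(d_{ij}, d_{ij'}) = \frac{-k(m-1)\sigma^2}{mn(\lambda_1 + rk - r)}, \quad j \ne j',$$

$$\mathrm{Cov}(d_{ij}, d_{i'j'}) = \frac{k\sigma^2}{mn(\lambda_1 + rk - r)}, \quad i \ne i',\ j \ne j'.$$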


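Efficiencies for factorial contrasts, similar to $E_1$ and $E_2$ in (16) and (17), 
are 


$$E_A = \lambda_2 v/rk \qquad (28)$$

for an A-factor contrast, 

$$E_C = (\lambda_1 + rk - r)/rk \qquad (29)$$

for a C-factor contrast, and 

$$E_{AC} = (\lambda_1 + rk - r)/rk \qquad (30)$$


for an AC-interaction contrast. Note that, for the singular subclass 
of two-associate class, group divisible designs, $E_C = E_{AC} = 1$, while, 
for the semi-regular subclass, $E_A = 1$. For a balanced design, $E_A = 
E_C = E_{AC} = E$, the efficiency of the design. 

Single degree of freedom contrasts may be obtained in much the 
usual way. To partition the effects of the m-level A-factor into (m − 1) 
individual contrasts, each yielding an adjusted sum of squares with 
one degree of freedom, we have in effect only to make an orthogonal 
transformation on the m estimators, $\bar t_{1.}, \cdots, \bar t_{m.}$. Let the u-th such 
contrast be 


$$L_u = \sum_{i=1}^{m}\ell_{iu}\,\bar t_{i.}, \qquad u = 1, \cdots, (m-1). \qquad (31)$$


(In the practice of analysis of variance, $\ell_{1u}, \cdots, \ell_{mu}$ will be a set of 
real coefficients that sum to zero.) The adjusted sum of squares for 
a test of the hypothesis that $\sum_{i=1}^{m}\ell_{iu}\alpha_i = 0$ is 


$$\text{Adj. S.S.}(L_u) = \frac{n\lambda_2 v}{k}\,\frac{\bigl(\sum_{i=1}^{m}\ell_{iu}\bar t_{i.}\bigr)^2}{\sum_{i=1}^{m}\ell_{iu}^2} = \frac{\lambda_2 v}{nk}\,\frac{\bigl(\sum_{i=1}^{m}\sum_{j=1}^{n}\ell_{iu}t_{ij}\bigr)^2}{\sum_{i=1}^{m}\ell_{iu}^2}. \qquad (32)$$


In the same way, let the v-th contrast among C-factor effects be 


$$L_v = \sum_{j=1}^{n}\ell_{jv}\,\bar t_{.j}, \qquad v = 1, \cdots, (n-1), \qquad (33)$$


and the adjusted sum of squares for a test of the hypothesis that 
$\sum_{j=1}^{n}\ell_{jv}\gamma_j = 0$ is 


$$\text{Adj. S.S.}(L_v) = \frac{m(\lambda_1 + rk - r)}{k}\,\frac{\bigl(\sum_{j=1}^{n}\ell_{jv}\bar t_{.j}\bigr)^2}{\sum_{j=1}^{n}\ell_{jv}^2} = \frac{\lambda_1 + rk - r}{mk}\,\frac{\bigl(\sum_{i=1}^{m}\sum_{j=1}^{n}\ell_{jv}t_{ij}\bigr)^2}{\sum_{j=1}^{n}\ell_{jv}^2}. \qquad (34)$$


The adjusted interaction sum of squares may also be partitioned. The 
adjusted sum of squares for the interaction of $L_u$ and $L_v$ is 


$$\text{Adj. S.S.}(L_uL_v) = \frac{\lambda_1 + rk - r}{k}\,\frac{\bigl(\sum_{i=1}^{m}\sum_{j=1}^{n}\ell_{iu}\ell_{jv}t_{ij}\bigr)^2}{\sum_{i=1}^{m}\ell_{iu}^2\,\sum_{j=1}^{n}\ell_{jv}^2}. \qquad (35)$$
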

The way is now open to consider multi-factor factorials. Suppose 
the levels of the A-factor are themselves factorial combinations of g 
factors, $A^{(1)}, \cdots, A^{(g)}$, with levels $m_1, \cdots, m_g$, $m = \prod_{x=1}^{g} m_x$. Then 
appropriate choice of contrasts $L_u$, u = 1, ⋯, (m − 1), permits 
computation of adjusted sums of squares for all of the main effects and 
interactions among $A^{(1)}, \cdots, A^{(g)}$. In the same way, the levels of 
the C-factor may be factorial treatment combinations of factors 
$C^{(1)}, \cdots, C^{(h)}$ with levels $n_1, \cdots, n_h$, $n = \prod_{x=1}^{h} n_x$. Then appropriate 
choice of contrasts $L_v$, v = 1, ⋯, (n − 1), permits computation of the 
adjusted sums of squares for these factorial effects. The adjusted 
interaction sums of squares, Adj. S.S. ($L_uL_v$), in (35) may be identified 
with interactions of main effects and interactions of $A^{(1)}, \cdots, A^{(g)}$ 
with main effects and interactions of $C^{(1)}, \cdots, C^{(h)}$. The computations 
for multi-factor factorials will be illustrated in the examples that follow. 

Note also that the levels of the A-factor may be associated with the 
treatment combinations of a fraction of a multi-factor factorial and so 
for the levels of the C-factor. Again appropriate choice of $L_u$ and $L_v$ 
leads easily to the required analysis. This also will be illustrated in 
one of the examples. 

The efficiency factor $E_A$ applies to all main-effect and interaction 
contrasts among $A^{(1)}, \cdots, A^{(g)}$, $E_C$ to those among $C^{(1)}, \cdots, C^{(h)}$, 
and $E_{AC}$ to interactions between A-factor factorial effects and C-factor 
factorial effects. 
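
To see how the three efficiency factors behave in the subclasses, the following sketch evaluates E_A = λ₂v/rk and E_C = E_AC = (λ₁ + rk − r)/rk for one singular and one semi-regular design. The function name `factorial_efficiencies` is hypothetical; the parameters are those quoted for designs S6 and SR21.

```python
from fractions import Fraction


def factorial_efficiencies(v, r, k, lam1, lam2):
    """Return (E_A, E_C, E_AC) for a group divisible design."""
    e_a = Fraction(lam2 * v, r * k)
    e_c = Fraction(lam1 + r * k - r, r * k)
    return e_a, e_c, e_c          # E_AC has the same form as E_C


print(factorial_efficiencies(v=8, r=3, k=4, lam1=3, lam2=1))    # singular S6
print(factorial_efficiencies(v=12, r=4, k=3, lam1=0, lam2=1))   # semi-regular SR21
```

The singular design estimates C and AC contrasts with full efficiency and A contrasts with efficiency 2/3; the semi-regular design reverses the roles. This is the sense in which factors can be assigned to rows or columns of the association scheme according to how precisely they need to be studied.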


4. FIRST EXAMPLE, A GROUP DIVISIBLE DESIGN 


We now illustrate the applications of the methods outlined. Two 
examples will be given, one on a near balance design and one on a group 
divisible design that is not a near balance design. It was necessary 
at the time of writing to use made-up examples but they indicate how 
the methods may be used. 


(i) Basic Analysis 
We first consider use of the group divisible design S6 in [3] with 
design parameters, 


v = 8, r = 3, k = 4, b = 6, m = 4, n = 2, $\lambda_1$ = 3, $\lambda_2$ = 1, 


and do the basic analysis of variance without regard to factorial make-up 
of treatments. The association matrix (1) now has four rows and two 
columns. Block lay-outs, observations, block totals, and grand total 
are shown in Table 3. To compute the $t_{ij}$ in (8), it is convenient to 
further summarize the data as in Table 4. We require $T_{ij}$, 
$\sum_{j=1}^{2} T_{ij}$, $B_{ij.}$, and $\sum_{j=1}^{2} B_{ij.}$, and these may be obtained easily from 
Table 3; one may split the keyboard on a calculating machine, add 

TABLE 3 
OBSERVATIONS AND TOTALS FOR DESIGN S6 

Block    Treatments and observations    Block total 
  1      V11    V12    V21    V22       B1 
          29     38     40     33       140 
  2      V31    V32    V41    V42       B2 
          28     37     24     24       113 
  3      V11    V12    V31    V32       B3 
          20     37     26     33       116 
  4      V21    V22    V41    V42       B4 
          23     22     24     15        84 
  5      V11    V12    V41    V42       B5 
          27     41     36     28       132 
  6      V21    V22    V31    V32       B6 
          37     26     29     37       129 
Total                                   G 
                                        714 
TABLE 4 

VALUES OF $T_{ij}$, $B_{ij.}$, $\sum_j T_{ij}$, $\sum_j B_{ij.}$, $t_{ij}$, $t_{i.}$, $t_{.j}$, $\bar t_{i.}$, AND $\bar t_{.j}$ 
FOR THE FIRST EXAMPLE, DESIGN S6 

Values        T_ij         Σ_j T_ij       B_ij.        Σ_j B_ij.         t_ij            t_i.      t̄_i. 
of i      j = 1  j = 2                j = 1  j = 2                  j = 1    j = 2 
  1         76    116        192       388    388        776       −7.167    6.167    −1.000    −0.500 
  2        100     81        181       353    353        706        4.292   −2.042     2.250     1.125 
  3         83    107        190       358    358        716       −1.250    6.750     5.500     2.750 
  4         84     67        151       329    329        658       −0.542   −6.208    −6.750    −3.375 
t.j                                                                −4.667    4.667 
t̄.j                                                                −1.167    1.167 

observations on one side to obtain $T_{ij}$, and add corresponding block 
totals on the other side to obtain $B_{ij.}$. In the same way, $\sum_{j=1}^{2} T_{ij}$ 
and $\sum_{j=1}^{2} B_{ij.}$ may be obtained in one operation. We have also included 
values $t_{ij}$, $t_{i.}$, $t_{.j}$, $\bar t_{i.}$, and $\bar t_{.j}$ in Table 4, for this is a convenient place 
to summarize these calculated values. Values of $t_{ij}$ are computed from 
(8) and, for design S6, (8) becomes 


$$t_{ij} = \frac{1}{3}T_{ij} + \frac{1}{12}\sum_{h=1}^{2} T_{ih} - \frac{1}{12}B_{ij.} - \frac{1}{48}\sum_{h=1}^{2} B_{ih.}.$$


To illustrate how the entries in Table 4 are obtained, we note that 


$$T_{11} = (29 + 20 + 27) = 76, \qquad B_{11.} = (140 + 116 + 132) = 388,$$

$$\sum_{j=1}^{2} T_{1j} = (76 + 116) = 192, \qquad \sum_{j=1}^{2} B_{1j.} = (388 + 388) = 776,$$

and 

$$t_{11} = \frac{1}{3}(76) + \frac{1}{12}(192) - \frac{1}{12}(388) - \frac{1}{48}(776) = -7.167.$$


Recall that 


$$t_{i.} = \sum_{j=1}^{2} t_{ij}, \qquad \bar t_{i.} = \tfrac{1}{2}\,t_{i.}, \qquad t_{.j} = \sum_{i=1}^{4} t_{ij}, \qquad \bar t_{.j} = \tfrac{1}{4}\,t_{.j}.$$

The sums of squares for the analysis of variance are now computed 
from (3), (4), and (5) in Table 1. Total and unadjusted block sums 
of squares are computed in the usual way. The adjusted treatment sum 
of squares (3) becomes for design S6 


$$\text{Adj. Treat. S.S.} = 3\sum_{i=1}^{4}\sum_{j=1}^{2} t_{ij}^2 - \frac{1}{2}\sum_{i=1}^{4}\Bigl(\sum_{j=1}^{2} t_{ij}\Bigr)^2 = 552.88.$$


The error sum of squares is obtained by subtraction, mean squares are 
obtained in the usual way, and the complete analysis of variance is 
given in Table 5 corresponding to the general Table 1. 
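
The whole basic analysis can be reproduced in a few lines. The sketch below is an illustration, not the authors' computing scheme: it computes the totals $T_{ij}$ and $B_{ij.}$, the estimators $t_{ij}$ from (8), and the sums of squares (3), (4), and (5). The assignment of treatment labels to blocks is an assumption, reconstructed so that the treatment totals and block totals quoted in the text are reproduced; the function name is hypothetical.

```python
# Block contents of design S6. The labels (i, j) are an assumption
# reconstructed from the treatment and block totals quoted in the text.
BLOCKS = [
    {(1, 1): 29, (1, 2): 38, (2, 1): 40, (2, 2): 33},
    {(3, 1): 28, (3, 2): 37, (4, 1): 24, (4, 2): 24},
    {(1, 1): 20, (1, 2): 37, (3, 1): 26, (3, 2): 33},
    {(2, 1): 23, (2, 2): 22, (4, 1): 24, (4, 2): 15},
    {(1, 1): 27, (1, 2): 41, (4, 1): 36, (4, 2): 28},
    {(2, 1): 37, (2, 2): 26, (3, 1): 29, (3, 2): 37},
]


def intra_block_analysis(blocks, r, k, m, n, lam1, lam2):
    """Return the estimators t_ij of (8) and the sums of squares of Table 1."""
    v = m * n
    levels_i, levels_j = range(1, m + 1), range(1, n + 1)
    block_totals = [sum(blk.values()) for blk in blocks]
    G = sum(block_totals)
    # T_ij: treatment totals; B_ij.: totals of block totals over blocks with V_ij.
    T = {(i, j): sum(blk.get((i, j), 0) for blk in blocks)
         for i in levels_i for j in levels_j}
    B = {(i, j): sum(bt for blk, bt in zip(blocks, block_totals) if (i, j) in blk)
         for i in levels_i for j in levels_j}
    c = lam1 + r * k - r
    t = {}
    for i in levels_i:
        row_T = sum(T[i, h] for h in levels_j)
        row_B = sum(B[i, h] for h in levels_j)
        for j in levels_j:
            numerator = (k * lam2 * v * T[i, j] - k * (lam2 - lam1) * row_T
                         - lam2 * v * B[i, j] + (lam2 - lam1) * row_B)
            t[i, j] = numerator / (v * lam2 * c)
    row_sums = [sum(t[i, j] for j in levels_j) for i in levels_i]
    ss = {}
    ss["treatments"] = (c * sum(x * x for x in t.values())
                        - (lam1 - lam2) * sum(x * x for x in row_sums)) / k
    correction = G * G / (r * v)
    ss["blocks"] = sum(bt * bt for bt in block_totals) / k - correction
    ss["total"] = sum(y * y for blk in blocks for y in blk.values()) - correction
    ss["error"] = ss["total"] - ss["blocks"] - ss["treatments"]
    return t, ss


t, ss = intra_block_analysis(BLOCKS, r=3, k=4, m=4, n=2, lam1=3, lam2=1)
print(round(t[1, 1], 3), {name: round(x, 2) for name, x in ss.items()})
```

Under the stated block assignment the sketch agrees, to rounding, with the estimator $t_{11}$ worked out above and with the sums of squares entered in the analysis of variance.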


TABLE 5 
BASIC ANALYSIS OF VARIANCE, DESIGN S6 

Source                D.f.     S.s.       M.s.      F 
Treatments (adj.)       7      552.88     78.98     10.52 
Blocks (unadj.)         5      495.00     99.00     ..... 
Intra-block error      11       82.62      7.51     ..... 
Total                  23     1130.50     .....     ..... 


(ii) Analysis for 4 by 2 Factorial 


We shall now suppose that the eight treatments resulted from a 
4 × 2 factorial associated with the treatment designations so that 
$V_{ij} = A_iC_j$, $A_i$ representing the i-th level of A, i = 1, ⋯, 4, and 
$C_j$ representing the j-th level of C, j = 1, 2. 

Adjusted sums of squares for factorial effects are obtained from 
(22) and (23) in Table 2. We have 


$$\text{Adj. S.S.}(A) = 4\sum_{i=1}^{4}\bar t_{i.}^{\,2} = 81.87$$


and 


$$\text{Adj. S.S.}(C) = 12\sum_{j=1}^{2}\bar t_{.j}^{\,2} = 32.67.$$


By subtraction of these two sums of squares from the adjusted treatment 
sum of squares in Table 5, we also obtain 


$$\text{Adj. S.S.}(AC) = 552.88 - (81.87 + 32.67) = 438.34.$$


Alternate computation of the adjusted AC-interaction sum of squares 
may be used as a computing check and would be based on (24) in 
Table 2. 
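
The computing check mentioned above can be sketched as follows. The code evaluates (22), (23), and (24) directly from the estimators $t_{ij}$, here copied to three decimals from Table 4, so small rounding differences from the text are to be expected. The function name is hypothetical.

```python
# Estimators t_ij copied from Table 4 (rows i = 1..4, columns j = 1, 2).
T_HAT = [[-7.167, 6.167], [4.292, -2.042], [-1.250, 6.750], [-0.542, -6.208]]


def factorial_sums_of_squares(t, r, k, lam1, lam2):
    """Adjusted S.S. for A, C, and AC from (22), (23), and (24)."""
    m, n = len(t), len(t[0])
    v = m * n
    c = lam1 + r * k - r
    row_means = [sum(row) / n for row in t]
    col_means = [sum(t[i][j] for i in range(m)) / m for j in range(n)]
    ss_a = n * lam2 * v / k * sum(x * x for x in row_means)            # (22)
    ss_c = m * c / k * sum(x * x for x in col_means)                   # (23)
    ss_ac = c / k * sum((t[i][j] - row_means[i] - col_means[j]) ** 2
                        for i in range(m) for j in range(n))           # (24)
    return ss_a, ss_c, ss_ac


ss_a, ss_c, ss_ac = factorial_sums_of_squares(T_HAT, r=3, k=4, lam1=3, lam2=1)
print(round(ss_a, 2), round(ss_c, 2), round(ss_ac, 2))
```

The direct evaluation of (24) agrees with the interaction sum of squares obtained by subtraction, up to the rounding of the tabulated $t_{ij}$.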

We shall take the factor A to be a quantitative one and subdivide 
the adjusted sum of squares for A into linear, quadratic, and cubic 
components. This is done using the first form of (32). The trend 
coefficients together with the sums of squares of the coefficients are 
given in Table 6. To illustrate, the adjusted sum of squares for the 


TABLE 6 


TREND COEFFICIENTS FOR SUBDIVISION OF ADJ. S.S. (A) 


                      Coefficients for 
Contrasts       t̄_1. =     t̄_2. =     t̄_3. =     t̄_4. =     Sums of squared 
                −0.500     1.125      2.750      −3.375     coefficients 
Linear A          −3         −1         +1         +3            20 
Quadratic A       +1         −1         −1         +1             4 
Cubic A           −1         +3         −3         +1            20 


linear A-component is 


$$\text{Adj. S.S.}(\text{Linear } A) = \frac{4}{20}\,[(-3)(-0.500) + \cdots + (3)(-3.375)]^2 = 9.80.$$

Similarly, 

$$\text{Adj. S.S.}(\text{Quad. } A) = 60.06$$

and 

$$\text{Adj. S.S.}(\text{Cubic } A) = 12.01.$$
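
The three trend components can be computed together from the first form of (32). This is a minimal sketch using the averages $\bar t_{i.}$ and the orthogonal polynomial coefficients of Table 6; the function name `contrast_ss_a` is hypothetical.

```python
ROW_MEANS = [-0.500, 1.125, 2.750, -3.375]     # t-bar_i. from Table 4
TREND = {
    "linear": [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic": [-1, 3, -3, 1],
}


def contrast_ss_a(coeffs, row_means, n, k, v, lam2):
    """First form of (32): adjusted S.S. for one A-factor contrast."""
    value = sum(l * x for l, x in zip(coeffs, row_means))
    return n * lam2 * v / k * value ** 2 / sum(l * l for l in coeffs)


components = {name: contrast_ss_a(c, ROW_MEANS, n=2, k=4, v=8, lam2=1)
              for name, c in TREND.items()}
print(components, sum(components.values()))
```

Because the three contrasts are mutually orthogonal, their sums of squares add to the adjusted sum of squares for A.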


The AC-interaction sum of squares could also be subdivided if we 
desired; this has not been done. 
The analysis of variance for the 4 by 2 factorial is given in Table 7. 
Referring back to (25) and (26), we note that 


$$V(a_i - a_{i'}) = \frac{(2)(4)}{(2)(1)(8)}\,(7.51) = 3.75, \quad i \ne i',\ i, i' = 1, \cdots, 4,$$

and 

$$V(c_1 - c_2) = \frac{(2)(4)}{4(3 + 12 - 3)}\,(7.51) = 1.25.$$


Variances and covariances of $a_i$ or $c_j$ may be quickly obtained, if desired, 
from use of (11), (12), and (13) and the definitions of $a_i$ and $c_j$ in (19) 
and (20). 
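
The two estimated variances follow from (25) and (26) with the error mean square substituted for $\sigma^2$. A minimal sketch, with a hypothetical function name and the error mean square 7.51 taken from the analysis of variance:

```python
def factorial_difference_variances(s2, r, k, m, n, lam1, lam2):
    """Estimated V(a_i - a_i') from (25) and V(c_j - c_j') from (26)."""
    v = m * n
    var_a = 2 * k * s2 / (n * lam2 * v)
    var_c = 2 * k * s2 / (m * (lam1 + r * k - r))
    return var_a, var_c


var_a, var_c = factorial_difference_variances(7.51, r=3, k=4, m=4, n=2, lam1=3, lam2=1)
print(var_a, var_c)
```

The results agree with the values quoted above to the rounding used in the text.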


(iii) Analysis for 2³ Factorial 


Suppose that the 4 by 2 factorial of the preceding subsection is 
really a 2³ factorial by taking the four levels of A to be made up of 
two levels of a factor N and two levels of a factor P. The complete 
association of the treatment combinations of the 2³ factorial with the 
treatments $V_{ij}$ is given in Table 8. We note only that in this new 
situation we require a new subdivision of the adjusted sum of squares 
for A and a subdivision of the AC-interaction of the last subsection. 

The contrasts required, each with one degree of freedom, are obtain- 
able most easily from the second forms in (32) and (34) and from (35). 
The required orthogonal sets of coefficients are shown in Table 8 and 
the resultant analysis of variance in Table 9. We simply illustrate 
the computing by considering the adjusted sum of squares for NPC- 
interaction. Here we have 


$$\text{Adj. S.S.}(NPC) = \frac{3}{8}\,[(-1)(-7.167) + (1)(6.167) + \cdots + (1)(-6.208)]^2 = 13.50.$$
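
All seven single degree of freedom sums of squares can be obtained with one routine. The sketch below is illustrative only. The coding of the four A levels as (N, P) combinations is an assumption, chosen so that the sums of squares reported in the paper are reproduced; the $t_{ij}$ are the three-decimal values of Table 4, so results agree with the text only to rounding. The function name `effect_ss` is hypothetical.

```python
T_HAT = {(1, 1): -7.167, (1, 2): 6.167, (2, 1): 4.292, (2, 2): -2.042,
         (3, 1): -1.250, (3, 2): 6.750, (4, 1): -0.542, (4, 2): -6.208}
# Assumed association: level i of A codes (N, P) as
# 1:(-,-), 2:(-,+), 3:(+,-), 4:(+,+); level j of C is 1:-, 2:+.
N_SIGN = {1: -1, 2: -1, 3: 1, 4: 1}
P_SIGN = {1: -1, 2: 1, 3: -1, 4: 1}
C_SIGN = {1: -1, 2: 1}


def effect_ss(effect, t, r, k, v, lam1, lam2):
    """Adjusted S.S. for one factorial effect such as 'N', 'PC', or 'NPC'."""
    def coeff(i, j):
        sign = 1
        if "N" in effect:
            sign *= N_SIGN[i]
        if "P" in effect:
            sign *= P_SIGN[i]
        if "C" in effect:
            sign *= C_SIGN[j]
        return sign

    value = sum(coeff(i, j) * x for (i, j), x in t.items())
    norm = sum(coeff(i, j) ** 2 for (i, j) in t)
    # Contrasts among A levels only carry lam2 * v / k; contrasts that
    # involve C carry (lam1 + rk - r) / k, as in (32), (34), and (35).
    multiplier = (lam1 + r * k - r) / k if "C" in effect else lam2 * v / k
    return multiplier * value ** 2 / norm


EFFECTS = ["N", "P", "C", "NP", "NC", "PC", "NPC"]
table = {e: effect_ss(e, T_HAT, r=3, k=4, v=8, lam1=3, lam2=1) for e in EFFECTS}
print({e: round(x, 2) for e, x in table.items()})
```

Writing every contrast in terms of coefficients on the individual $t_{ij}$ lets one formula cover the second forms of (32) and (34) as well as (35); only the multiplier changes between A-type contrasts and those involving C.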


(iv) Analysis for a Fraction (one-half) of a 2⁴ Factorial 


Let us think of the eight treatment combinations of Subsection 
(iii) as a half fraction of a 2⁴ factorial. We introduce the new factor D, 
with two levels, and use the defining contrast, 


I = NPCD. 


We can now associate the four-factor treatment combinations with the 
treatments $V_{ij}$ as indicated at the top of Table 8. The analysis of 
variance now obtained is identical with that of Table 9 but we note 
that confounding is present. The factorial contrasts of the 2³ factorial 
have as aliases additional contrasts of the 2⁴ factorial. These additional 
contrasts are shown in parentheses in Table 9. 


TABLE 9 
ANALYSIS OF VARIANCE AS A 2³ FACTORIAL AND AS A HALF-FRACTION OF A 2⁴ 
FACTORIAL 

Source                D.f.     S.s.       M.s.       F 
Treatments (adj.)       7      552.88     78.98     10.52 
  N (PCD)*              1        1.56      1.56      0.21 
  P (NCD)               1       20.25     20.25      2.70 
  C (NPD)               1       32.67     32.67      4.35 
  NP (CD)               1       60.06     60.06      8.00 
  NC (PD)               1        8.17      8.17      1.09 
  PC (ND)               1      416.67    416.67     55.48 
  NPC (D)               1       13.50     13.50      1.80 
Blocks (unadj.)         5      495.00 
Intra-block error      11       82.62      7.51 
Total                  23     1130.50 


*Contrasts in parentheses are aliases of the stated contrasts when we have the half-fraction of 
the 2⁴ factorial discussed in Subsection (iv). 


This fractional factorial is included to illustrate how such fractions may 
be used in the incomplete block designs. The analysis of variance 
would be useful if, for example, we know in advance that the D-factor 
does not interact with the other three factors in the experiment. Inter- 
pretive difficulty would not enter except perhaps in the case of the 
D-factor main effect which is confounded with NPC-interaction. 
Interpretation of this contrast is easy if NPC-interaction may be 
assumed inconsequential; if this assumption cannot be made, the design 
can hardly be regarded as appropriate for the study of the additional 
factor D. We then should have selected a different defining contrast 
or recognized the necessity of using a complete factorial. Difficulties 
in selecting the appropriate fraction of a multi-factor factorial for 
use in incomplete block designs are the same as those met in the use 
of fractional factorials in general and we shall not discuss them here. 


5. SECOND EXAMPLE, A NEAR BALANCE DESIGN 


In certain areas of application of experimental designs, lattice 
designs of various kinds have most frequently been used. Thus near 
balance, rectangular lattices and Latinized, rectangular lattices, while 
they fall in the more general class of group divisible designs, are better 
known than other designs of the general class. We have accordingly 
prepared an example that may be used to illustrate the use of factorials 
in the near balance, and Latinized, rectangular lattices and have 
already noted the special forms of formulas for near balance designs 
in general. A balanced lattice design is, of course, a balanced incomplete 
block design and the special formulas noted for balanced designs would 
apply. 

To use catalogued or derived near balance, and Latinized, rectangular 
designs, it is quite easy to write down the required association matrix 
by looking at the design itself. Alternatively, these designs are also 
catalogued in [3] together with their association matrices. They are 
listed among the semi-regular designs but are not designated as lattice 
designs and blocks are not necessarily arranged in replications. 

We think of a near balance design as a design with parameters 
specified in terms of K and Q as noted in Section 2. When the blocks 
are grouped into replications, the design becomes a near balance, 
rectangular lattice; when the blocks are grouped in a two-way pattern 
of rows and columns with each treatment in each row and column, the 
design becomes a Latinized, rectangular lattice. Our example is a four 
by three near balance design and is listed in [3] as design SR21. In 
this example K = 4, K − Q = 3, and Q = 1. Then v = 12, r = 4, 
k = 3, b = 16, m = 3, n = 4, $\lambda_1$ = 0, $\lambda_2$ = 1. We show the plot layouts, 
the observations, the block totals, and the grand total for our illustrative 
example in Table 10. 


(i) Basic Analysis 

Table 11 contains the treatment totals $T_{ij}$, the totals of block 
totals for blocks containing a specified treatment $B_{ij.}$, sums of these 
quantities, and values of the estimators $t_{ij}$ with certain of their sums 
and averages. Table 11 is comparable to Table 4 in the first example. 
For this example, the association scheme (1) has three rows and four 
columns. $T_{ij}$ and $B_{ij.}$ are obtained from Table 10 as before; the 
estimates $t_{ij}$ may again be obtained from (8) or, if preferred, from the 

TABLE 10 
OBSERVATIONS AND TOTALS FOR DESIGN SR21 


Vu Vis Viz By V2 Va Vas By 
15.5 15.0 16.0 46.5 21.5 22.5 16.5 60.5 
Va Voz Vos By Viz Vas Va Bio 
11.5 17.0 13.5 42.0 12.5 16.0 12.0 40.5 
Vis Viz Va Bs Vu Vos Van Bu 
15.0 12.0 16.5 43.5 13.0 13.0 13.5 39.5 
Van Vis Va By Vis Va Viz By 
13.0 12.0 10.0 35.0 11.0 12.5 11.0 34.5 
Va Vas Vaz B; Vos Viz Va Biz 
22.5 19.5 Teo) 59.5 16.5 15.0 14.5 46.0 
Vu Vie Vaz Bs Vis Va Vue Bu 
14.0 15.0 13.0 42.0 13.5 19.0 12.5 45.0 
Vis Vn Va B; Vie Vu Vas Bis 
12.5 15.0 11.5 39.0 10.0 15.0 10.0 35.0 
Viv Vos Va Bs Vu Vee Vas Bis 
10.0 11.5 15.0 36.5 10.5 12.5 12.5 35.5 
G 


TOTAL | 680.5 


special form in footnote 5. To illustrate, using the latter form and
values from Table 11, we show

$$t_{ij} = \frac{11}{32} T_{ij} - \frac{1}{32} \sum_{j' \neq j} T_{ij'} - \frac{1}{8} B_{ij.} + \frac{1}{96} G$$

and

$$t_{11} = \frac{11}{32}(53.0) - \frac{1}{32}(182.5) - \frac{1}{8}(163.5) + \frac{1}{96}(680.5) = -0.833.$$

The adjusted treatment sum of squares may be obtained through
substitution in (3) or the special form in footnote 5, the unadjusted block
sum of squares follows from (4), and the total sum of squares is obtained
in the usual way from (5). The intra-block error is obtained by
subtraction as indicated in Table 1. We show the basic analysis of variance
for this example in Table 13.
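The arithmetic of this subsection can be checked with a short Python sketch. It is not the authors' computing scheme: it assumes the usual group divisible solution, $t_{ij} = [kQ_{ij} - (\lambda_2 - \lambda_1) g_i] / [r(k-1) + \lambda_1]$ with $Q_{ij} = T_{ij} - B_{ij.}/k$ and $g_i$ the sum of the estimates in row $i$ of the association scheme, and it assumes the adjusted treatment sum of squares can be written $\sum t_{ij} Q_{ij}$. Both assumptions are adopted here because they reproduce the value -0.833 shown for $t_{11}$ and the sum of squares 115.15 of Table 13. The totals are as read from Table 11, and all function and variable names are illustrative.

```python
# Totals as read from Table 11 (rows i = 1..3, columns j = 1..4).
T = [[53.0, 61.5, 73.0, 48.0],
     [48.5, 66.0, 53.0, 56.5],
     [52.0, 54.5, 63.0, 51.5]]          # treatment totals T_ij
B = [[163.5, 171.0, 185.5, 160.5],
     [158.5, 177.0, 166.0, 179.0],
     [165.0, 164.0, 179.0, 172.5]]      # totals of block totals B_ij.

r, k, n, lam1, lam2 = 4, 3, 4, 0, 1     # parameters of Design SR21


def intra_block_estimates(T, B):
    """Return (Q, t): adjusted totals Q_ij and estimates t_ij (assumed form)."""
    Q = [[t_ij - b_ij / k for t_ij, b_ij in zip(t_row, b_row)]
         for t_row, b_row in zip(T, B)]
    denom = r * (k - 1) + lam1
    t = []
    for q_row in Q:
        # Sum of the estimates within one group (one row of the scheme).
        group_sum = k * sum(q_row) / (denom + n * (lam2 - lam1))
        t.append([(k * q - (lam2 - lam1) * group_sum) / denom for q in q_row])
    return Q, t


Q, t = intra_block_estimates(T, B)

# Adjusted treatment sum of squares, assumed to equal the sum of t_ij * Q_ij.
adjusted_treatment_ss = sum(t_ij * q_ij
                            for t_row, q_row in zip(t, Q)
                            for t_ij, q_ij in zip(t_row, q_row))

print(round(t[0][0], 3), round(adjusted_treatment_ss, 2))
```

Writing the estimator in terms of the design parameters rather than the fixed fractions makes it easier to see where the constants for this particular design come from.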
Blocks in Design SR21 may be grouped into replications so that
the design becomes a near balance, rectangular lattice. Then the
unadjusted block sum of squares may be partitioned into a sum of
squares for replications and a sum of squares for blocks within
replications. When the blocks are grouped into rows and columns with a
complete replication in each such row and column, we have a Latinized,


TABLE 11
VALUES OF $T_{ij}$, $B_{ij.}$, $\sum_j T_{ij}$, $\sum_j B_{ij.}$, $t_{ij}$, $t_{i.}$, $\bar{t}_{i.}$, $t_{.j}$ AND $\bar{t}_{.j}$
FOR THE SECOND EXAMPLE, DESIGN SR21

Values          T_ij, values of j           Sum of        B_ij., values of j           Sum of
of i        1       2       3       4       T_ij       1       2       3       4       B_ij.

  1       53.0    61.5    73.0    48.0     235.5     163.5   171.0   185.5   160.5     680.5
  2       48.5    66.0    53.0    56.5     224.0     158.5   177.0   166.0   179.0     680.5
  3       52.0    54.5    63.0    51.5     221.0     165.0   164.0   179.0   172.5     680.5

Values          t_ij, values of j
of i        1        2        3        4        t_i.     tbar_i.

  1      -0.833    1.417    3.917   -2.333     2.168     0.542
  2      -1.536    2.714   -0.786   -1.099    -0.707    -0.177
  3      -0.943    0.120    1.432   -2.068    -1.459    -0.365

Values of t_.j      -3.312    4.251    4.563   -5.500
Values of tbar_.j   -1.104    1.417    1.521   -1.833

rectangular lattice and the block sum of squares may be partitioned 
into sums of squares for rows, columns, and row by column interaction. 
These sums of squares, partitioning the unadjusted block sum of 
squares, are shown in Table 13 and the groupings of the blocks for 
the two special designs are indicated in Table 12. Subdivision of the 
unadjusted block sum of squares does not, of course, affect the rest of 
the intra-block analysis of variance. 
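The partition of the unadjusted block sum of squares can be reproduced from the block totals alone. The following Python sketch is illustrative only: the block totals are those of Table 10, the replicate groupings are those of Table 12, and the row groupings of the Latinized arrangement are taken so that every row and every column contains each block exactly once (an assumption where the printed table is hard to read). The helper name `group_ss` is hypothetical.

```python
# Block totals from Table 10, keyed by block number.
block_total = {1: 46.5, 2: 42.0, 3: 43.5, 4: 35.0,
               5: 59.5, 6: 42.0, 7: 39.0, 8: 36.5,
               9: 60.5, 10: 40.5, 11: 39.5, 12: 34.5,
               13: 46.0, 14: 45.0, 15: 35.0, 16: 35.5}
k = 3                                           # plots per block
grand_total = sum(block_total.values())
correction = grand_total ** 2 / (k * len(block_total))


def group_ss(groups):
    """Sum of squares among groups of blocks (each group is a list of block numbers)."""
    return sum(sum(block_total[b] for b in g) ** 2 / (k * len(g))
               for g in groups) - correction


# Groupings of blocks: replicates I to IV, and rows I to IV of the Latinized lattice.
replicates = [[5, 8, 7, 6], [15, 13, 16, 14], [12, 9, 11, 10], [3, 1, 4, 2]]
rows = [[14, 2, 10, 6], [16, 4, 12, 8], [13, 1, 9, 5], [15, 3, 11, 7]]

blocks_ss = group_ss([[b] for b in block_total])    # unadjusted blocks, 15 d.f.
replicates_ss = group_ss(replicates)                # replicates (columns), 3 d.f.
rows_ss = group_ss(rows)                            # rows, 3 d.f.
blocks_in_reps_ss = blocks_ss - replicates_ss       # 12 d.f.
rows_by_columns_ss = blocks_ss - replicates_ss - rows_ss   # 9 d.f.

print(round(blocks_ss, 2), round(replicates_ss, 2), round(rows_ss, 2),
      round(blocks_in_reps_ss, 2), round(rows_by_columns_ss, 2))
```

Because each subdivision is obtained by subtraction from the same unadjusted block sum of squares, the remaining lines of the intra-block analysis are unchanged, as the text notes.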
TABLE 12


BLOCK ARRANGEMENTS FOR DESIGN SR21 AS A NEAR BALANCE, RECTANGULAR
LATTICE AND AS A LATINIZED, RECTANGULAR LATTICE


(Entries in the table are block numbers as given in Table 10)


Near Balance, Rectangular Lattice

                 Replicates

          I      II     III     IV

          5      15      12      3
          8      13       9      1
          7      16      11      4
          6      14      10      2


Latinized, Rectangular Lattice

Rows             Columns

          I      II     III     IV

I        14       2      10      6
II       16       4      12      8
III      13       1       9      5
IV       15       3      11      7


(ii) Analysis for 3 by 4 Factorial.


We may suppose that the twelve treatments of Design SR21 are
made up of combinations of two factors, A and C, such that $V_{ij} = A_iC_j$,
i = 1, ..., 3, j = 1, ..., 4, with $A_i$ representing the i-th level of the
A-factor and $C_j$, the j-th level of the C-factor. The analysis of variance
is straightforward, and may be effected through use of (22), (23),
and (24) or the special forms in footnote 6. Values of $\bar{t}_{i.}$ and $\bar{t}_{.j}$ are
given in Table 11 and only substitution is required. As before the
adjusted AC-interaction sum of squares may be obtained by
subtraction (adjusted treatment sum of squares minus the total of adjusted
A-factor and C-factor sums of squares) or by direct calculation based
on (24).

The analysis of variance for the 3 by 4 factorial is included in Table 
15. We shall also further subdivide the factorial effects in the next 
subsection. We conclude this subsection with the variance estimates, 


Oba = cae CEM inke B34. 
TABLE 13
BASIC ANALYSES OF VARIANCE, DESIGN SR21


Source                        D.f.      S.s.      M.s.      F

Treatments (adj.)              11      115.15     10.47    7.81
Blocks (unadj.)                15      311.91     20.79    ...

  Subdivision for Near Balance, Rectangular Lattice

    Replicates                  3       12.94      4.31
    Blocks in Replicates       12      298.98     24.92

  Subdivision for Latinized, Rectangular Lattice

    Replicates (columns)        3       12.94      4.31
    Rows                        3      232.31     77.44
    Rows by Columns             9       66.67      7.41

Intra-Block Error              21       28.19      1.34
Total                          47      455.25
and 
(2)(3) ; , 
-—@.,) = : k =, (jy , 


obtainable from (25) and (26). These variances respectively apply to 
all contrasts among A-factor and C-factor effects and on this basis are 
appropriate for properly selected contrasts of the multi-factor factorial 
of the next subsection. 


(iii) Analysis for 3 by $2^2$ Factorial.


To complete the illustration of the use of factorials in a near balance
design, we now consider the treatments to be the treatment
combinations of a 3 by $2^2$ factorial and subdivide the A-factor, C-factor, and
AC-interaction sums of squares accordingly. We take the C-factor
to be divisible into four combinations of two-level factors N and P.
The identification of the treatments $V_{ij}$ in terms of these new factors
is shown in Table 14. In that table we also give the coefficients for
TABLE 14
TREATMENTS, TREATMENT ESTIMATES, COEFFICIENTS OF CONTRASTS, AND SUMS OF
SQUARED COEFFICIENTS FOR THE 3 BY $2^2$ FACTORIAL

Treatments        V11     V12     V13     V14     V21     V22     V23     V24     V31     V32     V33     V34    Sums of squared
Estimates t_ij  -0.833   1.417   3.917  -2.333  -1.536   2.714  -0.786  -1.099  -0.943   0.120   1.432  -2.068   coefficients

Contrasts
Linear A          -1      -1      -1      -1       0       0       0       0      +1      +1      +1      +1          8
Quadratic A       +1      +1      +1      +1      -2      -2      -2      -2      +1      +1      +1      +1         24
N                 -1      +1      -1      +1      -1      +1      -1      +1      -1      +1      -1      +1         12
P                 -1      -1      +1      +1      -1      -1      +1      +1      -1      -1      +1      +1         12
NP                +1      -1      -1      +1      +1      -1      -1      +1      +1      -1      -1      +1         12
N X Lin. A        +1      -1      +1      -1       0       0       0       0      -1      +1      -1      +1          8
N X Quad. A       -1      +1      -1      +1      +2      -2      +2      -2      -1      +1      -1      +1         24
P X Lin. A        +1      +1      -1      -1       0       0       0       0      -1      -1      +1      +1          8
P X Quad. A       -1      -1      +1      +1      +2      +2      -2      -2      -1      -1      +1      +1         24
NP X Lin. A       -1      +1      +1      -1       0       0       0       0      +1      -1      -1      +1          8
NP X Quad. A      +1      -1      -1      +1      -2      +2      +2      -2      +1      -1      -1      +1         24
the indicated contrasts and the sums of these squared coefficients
for each row. The sums of squares for the linear contrasts are computed 
as in complete block analysis except that additional multipliers are 
required as shown in (32), (34), and (35). The sums of squares for the 
contrasts used are given in Table 15. We have illustrated this type 
of computation in the first example and do not do so again here. 
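For readers who wish to retrace the single degree of freedom sums of squares, the following Python sketch may help. It is a reconstruction under stated assumptions, not the authors' formulas (32), (34), and (35), which are not restated here: each sum of squares is taken as $r E L^2 / \sum c^2$, where $L$ is the contrast among the estimates $t_{ij}$, with the multiplier $E = 1$ for A-factor contrasts and $E = 2/3$ for C-factor and AC-interaction contrasts. The estimator for $t_{ij}$, the multipliers, and the assignment of the N and P levels to the columns $j$ are all assumptions, chosen because they reproduce the entries of Table 15 to rounding. All names are illustrative.

```python
# Totals as read from Table 11.
T = [[53.0, 61.5, 73.0, 48.0],
     [48.5, 66.0, 53.0, 56.5],
     [52.0, 54.5, 63.0, 51.5]]
B = [[163.5, 171.0, 185.5, 160.5],
     [158.5, 177.0, 166.0, 179.0],
     [165.0, 164.0, 179.0, 172.5]]
r, k = 4, 3

# Estimates t_ij for Design SR21 (assumed form: lambda_1 = 0, lambda_2 = 1).
Q = [[t_ij - b_ij / k for t_ij, b_ij in zip(tr, br)] for tr, br in zip(T, B)]
t = [[(3 * q - sum(q_row) / 4) / 8 for q in q_row] for q_row in Q]

# Coefficients over the three A-levels and the four C-levels.
ONE_A, LIN_A, QUAD_A = [1, 1, 1], [-1, 0, 1], [1, -2, 1]
ONE_C = [1, 1, 1, 1]
N_C = [-1, 1, -1, 1]      # assumed: N separates columns 2, 4 from 1, 3
P_C = [-1, -1, 1, 1]      # assumed: P separates columns 3, 4 from 1, 2
NP_C = [1, -1, -1, 1]


def contrast_ss(a, c, efficiency):
    """Sum of squares for the contrast with coefficients a[i] * c[j]."""
    coef = [[ai * cj for cj in c] for ai in a]
    L = sum(coef[i][j] * t[i][j] for i in range(3) for j in range(4))
    divisor = sum(x * x for row in coef for x in row)
    return r * efficiency * L * L / divisor


E_A, E_WITHIN = 1.0, 2.0 / 3.0    # assumed multipliers
ss = {
    "Linear A": contrast_ss(LIN_A, ONE_C, E_A),
    "Quad. A": contrast_ss(QUAD_A, ONE_C, E_A),
    "N": contrast_ss(ONE_A, N_C, E_WITHIN),
    "P": contrast_ss(ONE_A, P_C, E_WITHIN),
    "NP": contrast_ss(ONE_A, NP_C, E_WITHIN),
    "N X Lin. A": contrast_ss(LIN_A, N_C, E_WITHIN),
    "N X Quad. A": contrast_ss(QUAD_A, N_C, E_WITHIN),
}
for name, value in ss.items():
    print(f"{name:12s} {value:7.2f}")
```

The remaining interaction contrasts follow by passing `P_C` or `NP_C` together with `LIN_A` or `QUAD_A`.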

Table 15 is sufficient to complete our illustration of the use of 
factorials in this second example. The unadjusted block sum of squares 
is not subdivided in Table 15 but the subdivisions of Table 13 may 
be used when desired. 


TABLE 15
ANALYSIS OF VARIANCE FOR THE 3 BY $2^2$ FACTORIAL

Source                   D.f.      S.s.      M.s.       F

Treatments (adj.)         11      115.15     10.47     7.81
  A-Factor                 2        7.32      3.66     2.73
    Linear A               1        6.57      6.57     4.90
    Quad. A                1        0.75      0.75     ...
  C-Factor                 3       71.20     23.73    17.71
    N                      1        1.39      1.39     1.04
    P                      1        0.78      0.78     ...
    NP                     1       69.03     69.03    51.51
  AC-Interaction           6       36.63      6.10     4.55
    N X Lin. A             1        0.81      0.81     ...
    N X Quad. A            1       22.76     22.76    16.99
    P X Lin. A             1        0.22      0.22     ...
    P X Quad. A            1        5.94      5.94     4.43
    NP X Lin. A            1        5.17      5.17     3.86
    NP X Quad. A           1        1.72      1.72     1.28
Blocks (unadj.)           15      311.91     20.79
Intra-Block Error         21       28.19      1.34
Total                     47      455.25


6. DISCUSSION AND SUMMARY 


Incomplete block designs have been widely used in agronomic and 
animal experimentation but only to a limited extent, until recently, 
in industrial and engineering research. In the latter areas it may be 
that the frequent need to use factorial arrangements of treatments has 
in some measure precluded the use of incomplete block designs. In 
this paper, we have shown how factorials may be used in group divisible,
partially balanced, incomplete block designs and illustrated the appro- 
priate, and quite simple, method of analysis. 

Factorials in incomplete block designs should be widely useful. 
In agriculture, they appear to be particularly useful in animal experi- 
mentation, where it is only possible to obtain small groups of homo- 
geneous animals, as from a litter, and these may constitute material 
for an incomplete block. In large animal experimentation, twins may 
be used in some cases in designs with blocks of size two. In experiments 
involving subjective judgments, it is necessary to keep the block sizes 
small owing to fatigue and adaption effects. There are many situations 
where factorials in incomplete block designs will be useful in taste 
testing; the only limitation is that our methods require applications 
where scores, that may reasonably be used with an assumption of 
normality, are obtainable. In industrial research, it frequently happens 
that there are limitations on the number of samples or runs that may 
be obtained as a homogeneous group. Limiting factors are time, 
batch sizes, equipment, sources of raw material, and personnel to cite 
a few, and incomplete block designs are suggested to circumvent such 
limitations. 

In this paper we have shown how m by n factorials may be
introduced into group divisible designs with m classes of n items. The
introduction of factorial treatments does not affect the fact that some of
these designs, such as near balance, and Latinized, rectangular lattices, 
may incorporate grouping of the blocks into replications or into a 
Latin square. Harshbarger [4], in discussing rectangular lattices, 
suggested that the K(K - 1) series were of primary importance in
that they provided treatment numbers well positioned between those 
available from square lattice designs. All series with K(K — Q) treat- 
ments become important in making factorials, with a wide variety of 
factor levels, available. It should be further noted that, while Q is an 
integer, it may in fact be either a positive or negative integer and 
designs with Q negative are included in the cited catalogue of designs. 

The adjusted treatment sum of squares for the group divisible 
designs was obtained as a simple function of the treatment-effect 
estimators in the development of the method for factorials. This, 
in itself, seems to be a useful new result and one that is helpful to an 
understanding of the analysis of such designs. The analysis for factorials 
was effected in terms of this form of the adjusted treatment sum of 
squares and in fact depends on a simple partitioning of that quantity. 
The main theoretical discussions are for an m by n factorial, but it is
further demonstrated how multi-factor factorials may be used when
the factor levels are divisors of m or n. Single degree of freedom
comparisons have been made in much the usual way. Illustrations in the
examples were chosen to show a variety of meaningful multi-factor 
factorials along with single degree of freedom comparisons. 

We did not consider how group divisible designs may be generated. 
Instead reference is made to a catalogue of designs and references 
there lead to methods of generating such designs. 

Recovery of inter-block information is being considered by R. E. 
Walpole working with the present authors. While this additional work 
is not yet available, considerable progress has been made. Research 
is also well under way on the problem of incorporating factorials into 
other partially balanced, incomplete block designs. Some of these 
other classes of designs are given by Bose, Clatworthy, and Shrikhande 
[3] and are the Simple, Triangular, and Latin Square types of two- 
associate class designs. Extensions to designs with more than two 
associate classes appear to be straight-forward and these designs may 
yield more flexibility for use with factorials. 

We have illustrated the method of analysis for factorials for two 
designs with numerical data. In doing this, we have carried out the 
basic analysis of variance in each case and then subdivided the adjusted 
treatment sum of squares into sums of squares for factorial effects. 
We have concluded the examples with the analysis of variance tables. 
It should be noted that, in analyses and reports on actual experimental 
data, care should be taken, as usual, to further summarize and interpret 
the findings of the experiment. Discussion of an experiment should 
not terminate with an analysis of variance table; we have not included 
interpretation of results because we were primarily interested in pre- 
senting new techniques of analysis, and interpretation follows in the 
usual way for factorial experiments. 

We are pleased to acknowledge the assistance of Boyd Harshbarger 
who offered helpful suggestions incorporated in this paper and closely 
followed the progress of this research. 
REPEATED LINEAR REGRESSION AND VARIANCE COMPONENTS
OF A POPULATION WITH BINOMIAL FREQUENCIES

C. C. Li
Department of Biostatistics, University of Pittsburgh, Pittsburgh, Pennsylvania, U.S.A.


1. The Initial Problem 


Consider the variate X = 0, 1, ..., i, ..., n, whose probability
density function is

$$f_i = \frac{n!}{i!\,(n-i)!}\, p^i q^{n-i}.$$

It is well known that $\bar{X} = np$ and $\sigma_X^2 = npq$. Now suppose that there
is a concomitant variate Y that takes the value $Y_0$ when X = 0, takes
the value $Y_1$ when X = 1, and so on. Therefore, $f_i$ is also the frequency
of the paired values (X, Y) = $(i, Y_i)$. Our problem is to find the linear
regression coefficient of Y on X. The least square method leads to the
solution that the slope of the fitted straight line should be

$$b = \sigma_{XY} / \sigma_X^2,$$

where $\sigma_{XY}$ denotes the covariance of X and Y. Since it is already
known that $\sigma_X^2 = npq$, our major task then is to find the value of

$$\sigma_{XY} = \sum f_i X_i Y_i - \bar{X}\bar{Y}.$$


2. The Theorem 


Recall that the frequencies of $Y_0, Y_1, \ldots, Y_n$ are given by the
terms $f_i$ of $(q + p)^n$. Now denote the successive differences

$$(Y_1 - Y_0),\ (Y_2 - Y_1),\ \ldots,\ (Y_n - Y_{n-1})$$

by

$$h_0,\ h_1,\ \ldots,\ h_{n-1}$$

as illustrated in Figure 1. These h's are the slopes of the segments
joining two adjacent points.


FIGURE 1

THE REGRESSION OF Y ON X WHEN THE FREQUENCIES OF THE PAIRED VALUES OR
POINTS ARE GIVEN BY EXPANDING $(q + p)^n$. THE DIFFERENCE $h_i = Y_{i+1} - Y_i$ IS
THE SLOPE OF THE SEGMENT JOINING TWO ADJACENT POINTS AS X VARIES BY UNIT
STEPS. THE SLOPE OF THE LEAST SQUARE LINE (NOT SHOWN IN THE GRAPH) IS THE
WEIGHTED MEAN OF THE h's WHERE THE WEIGHTS ARE TERMS OF $(q + p)^{n-1}$.


The theorem, to be proved after considering an example, asserts that
the regression coefficient of Y on X is simply the weighted mean of the
segment slopes, where the weights are terms of $(q + p)^{n-1}$; that is:

$$b = \sum_{i=0}^{n-1} \binom{n-1}{i} p^i q^{n-1-i} (Y_{i+1} - Y_i)$$

$$= q^{n-1} h_0 + (n-1) p q^{n-2} h_1 + \cdots + p^{n-1} h_{n-1}$$

$$= \sum f'_i h_i.$$



3. An Example 


Let us take the example in which n = 5. Then the values of $f_i$
are the six terms of $(q + p)^5$ and $\bar{X} = 5p$ and $\sigma_X^2 = 5pq$. The purpose
of giving such an example at this stage is not so much to show the end
result as to outline the method of proof employed in the next section.
By following the steps of Table 1, much explanation may be saved and
possible confusion may be avoided later. The first column of Table 1
gives

$$\sum f_i X_i Y_i = 5p(q^4 Y_1 + 4pq^3 Y_2 + \cdots + p^4 Y_5) = 5pY',$$

where $Y'$ is the weighted mean of $Y_1, Y_2, \ldots, Y_5$, with the terms $f'_i$
of $(q + p)^4$ as weights. Thus the covariance is

$$\sigma_{XY} = \sum f_i X_i Y_i - \bar{X}\bar{Y} = 5p(Y' - \bar{Y}).$$

We can now ignore the factor $\bar{X} = 5p$ and concentrate on the part
$Y' - \bar{Y}$. To facilitate simplification, we split each term of $Y'$ into
two parts by multiplying it by (p + q) = 1. Take, for instance, the
term involving $Y_3$:

$$6p^2q^2\, Y_3 (p + q) = 6p^3q^2\, Y_3 + 6p^2q^3\, Y_3.$$

Then, as indicated in the middle column of Table 1, put the first term
on the same line of the original term but put the second term in the

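TABLE 1
CALCULATION OF THE COVARIANCE IN A $(q + p)^5$ POPULATION.

Product term f X Y                         Ignore 5p; split by (p + q)          Minus terms of $\bar{Y}$

$q^5 \cdot 0 \cdot Y_0 = 0$                             $q^5 Y_1$                   $-q^5 Y_0$
$5pq^4 \cdot 1 \cdot Y_1 = 5p \cdot q^4 Y_1$            $pq^4 Y_1 + 4pq^4 Y_2$      $-5pq^4 Y_1$
$10p^2q^3 \cdot 2 \cdot Y_2 = 5p \cdot 4pq^3 Y_2$       $4p^2q^3 Y_2 + 6p^2q^3 Y_3$ $-10p^2q^3 Y_2$
$10p^3q^2 \cdot 3 \cdot Y_3 = 5p \cdot 6p^2q^2 Y_3$     $6p^3q^2 Y_3 + 4p^3q^2 Y_4$ $-10p^3q^2 Y_3$
$5p^4q \cdot 4 \cdot Y_4 = 5p \cdot 4p^3q Y_4$          $4p^4q Y_4 + p^4q Y_5$      $-5p^4q Y_4$
$p^5 \cdot 5 \cdot Y_5 = 5p \cdot p^4 Y_5$              $p^5 Y_5$                   $-p^5 Y_5$

Total: $5p \cdot Y'$                                    Total of the last two columns: $Y' - \bar{Y}$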
preceding line involving $Y_2$. Finally, write down the terms of $\bar{Y}$ in
the usual order in the last column. Then the value of $(Y' - \bar{Y})$ is
the sum of the differences obtained by subtracting the corresponding
terms (on the same line) of the last column from those of the middle
column. Thus, we obtain

$$Y' - \bar{Y} = q^5(Y_1 - Y_0) + 4pq^4(Y_2 - Y_1) + \cdots + p^4q(Y_5 - Y_4)$$

$$= q\{q^4 h_0 + 4pq^3 h_1 + 6p^2q^2 h_2 + 4p^3q h_3 + p^4 h_4\}.$$

Consequently,

$$\sigma_{XY} = 5p(Y' - \bar{Y}) = 5pq\{q^4 h_0 + \cdots + p^4 h_4\}$$

and

$$b = \sigma_{XY}/5pq = q^4 h_0 + 4pq^3 h_1 + \cdots + p^4 h_4$$

as asserted by the theorem previously.


4. General Demonstration


The following demonstration for the general truth of the theorem
follows precisely the steps outlined in the preceding example. It is
merely a rewriting of the relationship in a more general form. Maybe
it is not the best proof that can be given, but it requires only the
knowledge of a well known identity of binomial coefficients. The covariance
for any integral value of n is

$$\sigma_{XY} = \sum f_i X_i Y_i - \bar{X}\bar{Y}$$

$$= \sum_i \frac{n!}{i!\,(n-i)!}\, p^i q^{n-i}\, i\, Y_i - np\bar{Y}$$

$$= np\left\{\sum_i \frac{(n-1)!}{(i-1)!\,(n-i)!}\, p^{i-1} q^{n-i}\, Y_i - \bar{Y}\right\}.$$

Split each term of the summation into two by multiplying it by (p + q);
thus the two successive terms involving $Y_i$ and $Y_{i+1}$ become,
respectively:

$$\frac{(n-1)!}{(i-1)!\,(n-i)!}\, p^{i} q^{n-i}\, Y_i + \frac{(n-1)!}{(i-1)!\,(n-i)!}\, p^{i-1} q^{n-i+1}\, Y_i$$

and

$$\frac{(n-1)!}{i!\,(n-i-1)!}\, p^{i+1} q^{n-i-1}\, Y_{i+1} + \frac{(n-1)!}{i!\,(n-i-1)!}\, p^{i} q^{n-i}\, Y_{i+1}.$$

The corresponding term in $\bar{Y}$ (which is to be subtracted) is

$$\frac{n!}{i!\,(n-i)!}\, p^i q^{n-i}\, Y_i.$$

Collecting the three terms involving the same $p^i q^{n-i}$ and noting that

$$\frac{n!}{i!\,(n-i)!} - \frac{(n-1)!}{(i-1)!\,(n-i)!} = \frac{(n-1)!}{i!\,(n-i-1)!},$$

we obtain the term

$$\frac{(n-1)!}{i!\,(n-i-1)!}\, p^i q^{n-i} (Y_{i+1} - Y_i) = q \binom{n-1}{i} p^i q^{n-1-i}\, h_i.$$

Summing such terms, we obtain the covariance

$$\sigma_{XY} = npq \sum f'_i h_i$$

where $f'_i$ are the terms of $(q + p)^{n-1}$. Dividing $\sigma_{XY}$ by $\sigma_X^2 = npq$, we
establish the theorem that the linear regression coefficient of Y on X
is the weighted mean of the segment slopes $h_i$ where the weights are
terms of $(q + p)^{n-1}$ as stated in section 2.

Having found the slope of the regression line, we can write down
the equation of the straight line. Furthermore, we know that the
portion of $\sigma_Y^2$ that is due to regression is $b\sigma_{XY} = b^2\sigma_X^2$, and in this case
it is equal to $npqb^2$.


5. Repeated linear regression 


In order to continue the process of linear regression in successive
stages, we need a more general system of notation. If Y is the original
series of values, we shall use Y' to denote the successive differences of
Y so that $Y'_0 = Y_1 - Y_0$, etc., which has been denoted by h in our
preliminary theorem. Similarly, $Y''_0 = Y'_1 - Y'_0$, etc. The procedure is
outlined in Table 2 for the case n = 5. An analogous system of notation
may be adopted for the frequencies. Thus, f denotes the terms of
$(q + p)^n$, and f' the terms of $(q + p)^{n-1}$, etc. Further, the mean of the
original Y values is $\bar{Y} = \sum fY$, and the mean of the Y' values is
$\bar{Y}' = \sum f'Y'$, etc. as indicated at the bottom of Table 2. What we have shown
in the preceding section is that the linear regression coefficient in the
original population (i.e. the (0)-column) is equal to the mean of the
next column to the right (i.e. the (1)-column). From the theorem
we have demonstrated, it follows that the regression coefficient for the
(1)-column is equal to the mean of the (2)-column, etc. Hence, the
successive mean values $\bar{Y}', \bar{Y}'', \ldots$, are all regression coefficients.
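The repeated differencing described above is easy to mechanise. The following Python sketch is an added illustration with hypothetical names: it forms each column of successive differences in turn and weights it by the terms of the binomial of the matching power. It is exercised here on the five measurements 8, 20, 48, 64, 56 with p = q = 1/2 that appear in the genetics example of section 9.

```python
from fractions import Fraction
from math import comb


def repeated_regression_means(y, p):
    """Return the means of Y, Y', Y'', ... for a (q + p)^n population.

    Column (0) holds the n + 1 original values; each later column holds the
    successive differences of the one before it and is weighted by the terms
    of the binomial whose power is one lower.
    """
    q = 1 - p
    column = list(y)
    means = []
    while column:
        m = len(column) - 1
        weights = [comb(m, i) * p**i * q**(m - i) for i in range(m + 1)]
        means.append(sum(w * v for w, v in zip(weights, column)))
        column = [column[i + 1] - column[i] for i in range(m)]
    return means


means = repeated_regression_means([8, 20, 48, 64, 56], Fraction(1, 2))
print([int(m) for m in means])
```

The first entry is the ordinary mean; every later entry is the regression coefficient of the column to its left.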
TABLE 2
REPEATED REGRESSION FOR BINOMIALS

          (0)               (1)               (2)               (3)              (4)              (5)
 i     Y      f          Y'     f'         Y''    f''        Y'''   f'''      Y''''  f''''     Y'''''  f'''''

 0    Y_0    q^5         Y'_0   q^4        Y''_0  q^3        Y'''_0 q^2       Y''''_0  q       Y'''''_0  1
 1    Y_1    5pq^4       Y'_1   4pq^3      Y''_1  3pq^2      Y'''_1 2pq       Y''''_1  p
 2    Y_2    10p^2q^3    Y'_2   6p^2q^2    Y''_2  3p^2q      Y'''_2 p^2
 3    Y_3    10p^3q^2    Y'_3   4p^3q      Y''_3  p^3
 4    Y_4    5p^4q       Y'_4   p^4
 5    Y_5    p^5

Mean  $\bar{Y}$          $\bar{Y}'$        $\bar{Y}''$       $\bar{Y}'''$     $\bar{Y}''''$    $\bar{Y}'''''$


6. Components of Variance of Y 


We have already noted that one portion of the variance of Y is
$b\sigma_{XY} = b^2\sigma_X^2 = npqb^2 = npq\bar{Y}'^2$. Now for the case of n = 5, for example,
we note that the following expression is an identity:

$$\sum fY^2 - \bar{Y}^2 = 5pq\,\bar{Y}'^2 + 10p^2q^2\,\bar{Y}''^2 + 10p^3q^3\,\bar{Y}'''^2 + 5p^4q^4\,\bar{Y}''''^2 + p^5q^5\,\bar{Y}'''''^2.$$

For the general case of the $(q + p)^n$ population we have

$$\sum fY^2 - \bar{Y}^2 = npq\,\bar{Y}'^2 + \binom{n}{2}p^2q^2\,\bar{Y}''^2 + \binom{n}{3}p^3q^3\,\bar{Y}'''^2 + \cdots + p^nq^n\,(\bar{Y}^{(n)})^2.$$

In other words, the variance $\sigma_Y^2 = \sum fY^2 - \bar{Y}^2$ may be partitioned
into n components. The first component is due to variation in the
original Y values as expressed in terms of $Y'_i = Y_{i+1} - Y_i$ where
i = 0, 1, 2, .... The second component is due to the variation of Y'
expressed in terms $Y''_i = Y'_{i+1} - Y'_i$ or the variation of the original Y
in terms of $Y_{i+2} - 2Y_{i+1} + Y_i$. Similarly, the third component is
due to $Y_{i+3} - 3Y_{i+2} + 3Y_{i+1} - Y_i$, and so on. In view of this property,
we may speak of the first component as the linear component, the
second quadratic, the third cubic, etc.
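The identity can be tested on any set of values before attempting a general proof. The Python sketch below is an added illustration with hypothetical names and arbitrary test values: it computes each component as the binomial coefficient times $(pq)^k$ times the squared mean of the k-th difference column, and compares the sum with the variance computed directly. Exact fractions are used so the comparison is an equality.

```python
from fractions import Fraction
from math import comb


def weighted_mean(values, p):
    """Mean of `values` weighted by the terms of (q + p)^(len(values) - 1)."""
    q = 1 - p
    m = len(values) - 1
    return sum(comb(m, i) * p**i * q**(m - i) * v for i, v in enumerate(values))


def variance_components(y, p):
    """Linear, quadratic, cubic, ... components of the variance of Y."""
    q = 1 - p
    n = len(y) - 1
    column = list(y)
    components = []
    for order in range(1, n + 1):
        column = [column[i + 1] - column[i] for i in range(len(column) - 1)]
        mean = weighted_mean(column, p)
        components.append(comb(n, order) * (p * q)**order * mean**2)
    return components


def variance(y, p):
    """Variance of Y computed directly as sum f Y^2 minus the squared mean."""
    return weighted_mean([v * v for v in y], p) - weighted_mean(y, p)**2


y_values = [2, 7, 1, 8, 2, 8]          # arbitrary illustrative values, n = 5
p = Fraction(1, 4)
components = variance_components(y_values, p)
print(sum(components) == variance(y_values, p))
```

Applied to the measurements 8, 20, 48, 64, 56 with p = 1/2, the same function returns the components 289, 24, 25, and 1 of the genetics example.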
7. Outline of Proof


For small values of n, the above identity may be directly verified
by writing out every term of $\bar{Y}^2, \bar{Y}'^2, \bar{Y}''^2, \ldots$ and seeing that both
sides are equal. For the general case we need to show that the coefficient
of $Y_i^2$, after collecting the terms of the right hand side, reduces to
$f_i = \binom{n}{i} p^i q^{n-i}$ and that the coefficient of $Y_iY_j$ is zero ($i \neq j$). This
will involve a great deal of writing. Presumably the identity may be
established through more advanced mathematical tools which the
author does not possess. The following indicates the method of a
longhand proof and suffices to show that the coefficient of $Y_0^2$ is $q^n$
and that of $Y_0Y_1$ is zero.


Source                                            Coefficient of $Y_0^2$         Coefficient of $Y_0Y_1$

$\bar{Y}^2$                                       $q^{2n}$                       $2npq^{2n-1}$
$npq\,\bar{Y}'^2$                                 $npq^{2n-1}$                   $-2npq^{2n-1} + 2n(n-1)p^2q^{2n-2}$
$\binom{n}{2}p^2q^2\,\bar{Y}''^2$                 $\binom{n}{2}p^2q^{2n-2}$      $-2n(n-1)p^2q^{2n-2} + \cdots$
...                                               ...                            ...
$np^{n-1}q^{n-1}\,(\bar{Y}^{(n-1)})^2$
  (with n - 1 primes)                             $np^{n-1}q^{n+1}$              $-2n(n-1)p^{n-1}q^{n+1} + 2np^nq^n$
$p^nq^n\,(\bar{Y}^{(n)})^2$
  (with n primes)                                 $p^nq^n$                       $-2np^nq^n$

Sum                                               $q^n$                          0


8. Corollaries 


From the general identity and the example given in Table 2 it is
clear that some of the components of $\sigma_Y^2$ may be zero. For instance,
if the Y''' values in the (3)-column are all zero, which implies that the
Y'' values are a constant, which in turn implies that the Y' values
form an arithmetic progression, then $\sigma_Y^2$ has only the first two
components, while all subsequent ones are zero.

This example, however, does not mean that $\sigma_Y^2$ always has the first
k components and the subsequent n - k components vanish. A more
general statement is that if the mean of a certain column is zero, the
corresponding component of the variance of Y vanishes. In particular,
if p = q = 1/2, and the original Y values are symmetrical, such as


Vien ee eS, a 2 
for n = 5, it will be found that the first, third, and fifth components
vanish and the variance of Y consists of only the second and the fourth
components. Finally, as an extreme example, if the original Y values
constitute an arithmetic series, so that Y' is a constant, then
$\sigma_Y^2 = npq\bar{Y}'^2$, having only one component. Further, if Y' = 1, $\sigma_Y^2 = npq$,
which is the ordinary binomial variance.


9. An Application in Genetics 

In a random mating autotetraploid population with random
chromosome segregation, the proportions of the five genotypes with respect
to one pair of genes are as follows:


Genotype:       aaaa      Aaaa       AAaa         AAAa       AAAA
Frequencies:    $q^4$     $4pq^3$    $6p^2q^2$    $4p^3q$    $p^4$
Measurement:    $Y_0$     $Y_1$      $Y_2$        $Y_3$      $Y_4$


The Y values are the measurements of a quantitative trait (such as
weight, vitamin content, oil content, etc.) of the various genotypes.
If the trait under consideration is subject to random fluctuations, we
take the mean value of the trait for each genotype as the "genotypic
value" of that genotype and treat it as our Y. The variance of Y in
a population is known as the "genotypic variance".

Applying our theorem on the components of variance of binomial
populations and putting n = 4, we may split the genotypic variance
into four components (Li [1957]). The first component is due to
differences of the type $Y_1 - Y_0$ or Aaaa - aaaa. The second component is
due to the values of the type $Y_2 - 2Y_1 + Y_0$, or AAaa - 2Aaaa + aaaa.
The third component is due to the discrepancies of the type
$Y_3 - 3Y_2 + 3Y_1 - Y_0$, or AAAa - 3AAaa + 3Aaaa - aaaa. The last component
is due to AAAA - 4AAAa + 6AAaa - 4Aaaa + aaaa. These
components have been named the additive, digene, trigene, and
quadrigene components (Kempthorne [1954]). A numerical example is
given in Table 3. The partition of the genotypic variance into such 
components will facilitate the prediction of the correlation between 
relatives, especially the parent-offspring and the sib-sib correlations 
which are useful in breeding work. A more detailed discussion of the 
subject would be out of place here. 
TABLE 3


GENOTYPIC VARIANCE COMPONENTS OF AN AUTOTETRAPLOID F, POPULATION WITH
TWO ALLELES (p = q = 1/2).


Y = Average Measurement of Genotype.


Genotype     Y      f         Y'     f'        Y''    f''       Y'''   f'''      Y''''  f''''

aaaa         8     1/16       12    1/8        16    1/4        -28    1/2        16     1
Aaaa        20     4/16       28    3/8       -12    2/4        -12    1/2
AAaa        48     6/16       16    3/8       -24    1/4
AAAa        64     4/16       -8    1/8
AAAA        56     1/16

Mean     $\bar{Y} = 43$    $\bar{Y}' = 17$    $\bar{Y}'' = -8$    $\bar{Y}''' = -20$    $\bar{Y}'''' = 16$

additive component    = $4pq\,\bar{Y}'^2 = 4(\tfrac12)(\tfrac12)(17)^2$ = 289
digene component      = $6p^2q^2\,\bar{Y}''^2 = 6(\tfrac12)^2(\tfrac12)^2(-8)^2$ = 24
trigene component     = $4p^3q^3\,\bar{Y}'''^2 = 4(\tfrac12)^3(\tfrac12)^3(-20)^2$ = 25
quadrigene component  = $p^4q^4\,\bar{Y}''''^2 = (\tfrac12)^4(\tfrac12)^4(16)^2$ = 1


Total genotypic variance = $\sigma_Y^2$ = sum = 339


APPRECIATION 


In January 1957, with the final transfer of Treasurer’s duties, 
Dr. Chester I. Bliss terminated over nine years of sterling service as a 
general officer of our International society. 

I doubt very much whether the membership fully appreciates the 
extent of its debt to Dr. Bliss. The Biometric Society was organised in 
September 1947 at the Marine Biological Laboratory in Woods Hole,
Massachusetts, during the First International Biometric Conference. 
This conference was arranged on the initiative of the Biometrics Section 
of the American Statistical Association, and one of its objectives was 
to set international cooperation in biometry on an effective and enduring 
foundation. Many individuals were inspired by this objective, but it 
was Chester Bliss who, besides the inspiration, had the enthusiasm 
and driving energy so necessary to bring the plan to fruition. Bliss 
played the major réle in organising the Woods Hole meeting, and I 
believe it is fair to say that he, more than any other, was responsible 
for the foundation of the Society. His convictions were amply justified. 
Within two years, 6 organised regions were established, and the membership
totalled nearly 900. Four years later, 2 more regions had been
added, and the membership had been increased by a further 25 per cent; 
today, there are 9 organised regions, and 6 other national groups, 
covering over 1300 members. 

Such rapid expansion made huge demands on Bliss, first as Secretary 
and later as Secretary-Treasurer, but he still found time to play his part 
in the organisation of three international biometric conferences and two 
international biometric symposia. This record of unselfish devotion to 
the Society speaks for itself. 

I count it a great privilege to express, on behalf of the membership, 
our deep appreciation and grateful thanks to Chester Bliss. 


E. A. Cornish
President 
QUERIES AND NOTES


George W. Snedecor, Editor


Arrangements of Pots in Greenhouse Experiments


QUERY 126: The proper method of arranging pots in a greenhouse
pot experiment has always been a source of worry and confusion
to me. I am speaking specifically about greenhouse tests in
which the pots are arranged in a randomized block design in order to
secure yield, or other measurement data to be subjected to an analysis
of variance.

Is it proper, or is it improper, to periodically rearrange the pots in a 
randomized fashion within a replication? If the pots were field plots, 
it would, of course, be impossible to periodically rearrange them. Yet, 
it has been my observation that many agronomists periodically re- 
arrange the location of the pots within the replications. The argument 
of agronomists is that periodical rearrangement will reduce border, 
heat and light effects. Perhaps the rearrangements only spread these 
environmental effects throughout the entire replication. If this is true, 
is it desirable and does it lend to the accuracy of the experiment? 

Another question concerns the rearranging of the location of the 
entire replication. Are we justified in periodically exchanging the 
locations of the entire replications? I would certainly be happy to 
have any information or opinions on this problem. 


ANSWER: The query is partially answered by the following quotation
[Designs of greenhouse experiments for statistical analysis,
G. Cox and W. G. Cochran, Soil Science, 62, 87-98, 1946]:

"In the foregoing experiments, the pots within the incomplete blocks
were re-randomized at intervals during the early part of the trial. It
is now believed that this was unnecessary. ... Unfortunately, we
have found no data on the effectiveness of the practice: its drawbacks
are the labor involved, the possibility of injury to the plants, and the
opportunity for unobserved biases. The use of incomplete block designs
should render the practice of less value ... ."

In addition I would make the following comments. First, however,
I would say that if there is any possibility of injury to the plants the pots
should not be moved around. With this qualification it seems to me
that the two relevant aspects are (1) the validity of the experiment
with relocating of pots and (2) efficiency of so doing.

My opinion is that the validity of the experiment is unaffected by 
moving the pots around in either a prechosen way independent of the 
treatments or at random. We may regard the total error of a pot yield 
as being made up of the following parts: 


(1) errors of measurements of yield 

(2) errors peculiar to the pot, e.g. the soil or matrix, the plant, 
and possibly even a peculiarity of the pot per se 

(3) errors due to location of the pot in the greenhouse 


Parts (1) and (2) are irrelevant to or invariant in the consideration of 
your questions. As regards (3), we can visualize the total error as 
consisting of the total effect of deviations of the location of the pot from 
the average in the block for as many time intervals as we like. These 
we would expect to be non-additive in their effects on yield but we 
would expect that if a location gets more sunshine for example in the 
first month it will get more in the second month and so on. By moving 
the pots around I would expect that we would more nearly equalize 
the effects of location of pots among the treatments and hence reduce 
the experimental error. It might be argued that there would be some 
interaction with location in the block. This will, however, behave 
partially as random error and keeping the pots in the same location 
in the block will not make such an effect easier to discover. 

As regards efficiency we may envisage components of error variance
as follows: $\sigma_1^2$ (measurement), $\sigma_2^2$ (pot), $\sigma_3^2$ (location). The total error
will be

$$\sigma_1^2 + \sigma_2^2 + \sigma_3^2.$$

Perhaps by moving the pots around one halves the last component, in
which case the total error would be

$$\sigma_1^2 + \sigma_2^2 + \frac{\sigma_3^2}{2}.$$

You can visualize that if, for example,

$$\sigma_1^2 = 5, \quad \sigma_2^2 = 25, \quad \sigma_3^2 = 5,$$

one will not achieve much greater efficiency by moving pots around
particularly if a lot of work is involved. If on the other hand

$$\sigma_1^2 = 5, \quad \sigma_2^2 = 25, \quad \sigma_3^2 = 50$$

and the moving does cut the third component in half, we would reduce
the error variance from 80 to 55 so that we would increase the efficiency
of the experiment by

$$100\left(\frac{80}{55} - 1\right) \text{ per cent, or 45 per cent.}$$


We would need for example 4 replicates, with pots moved around, to 
obtain as much information as we would get with 6 replicates and no 
moving of pots. Of course some labor would be involved in doing the 
moving, which might better be spent on more replicates. 
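
The arithmetic of this example can be checked with a short sketch. It assumes variance components of 5, 25 and 50, which reproduce the totals of 80 and 55 quoted above; the function names are illustrative only and are not part of the original reply.

```python
def total_error_variance(measurement, pot, location, location_factor=1.0):
    """Sum of the three error components; location_factor scales the
    location component (0.5 if moving the pots halves it)."""
    return measurement + pot + location_factor * location


def efficiency_gain_percent(var_fixed, var_moved):
    """Percentage gain in efficiency when the error variance drops
    from var_fixed to var_moved."""
    return 100.0 * (var_fixed / var_moved - 1.0)


# Assumed components: measurement 5, pot 25, location 50.
fixed = total_error_variance(5, 25, 50)
moved = total_error_variance(5, 25, 50, location_factor=0.5)
gain = efficiency_gain_percent(fixed, moved)

# Replicates with moving that match 6 replicates without moving.
replicates_needed = 6 * moved / fixed

print(fixed, moved, gain, replicates_needed)
```

Under these assumptions the equivalent number of replicates comes out slightly above 4, in line with the comparison of 4 and 6 replicates made in the text.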

On your second question of relocating whole replicates, there are no 
logical difficulties at all. We may for instance visualize block 1 as 
being on bench 1 for the first month, on bench 3 for the second month 
and on bench 2 for the third month. We would tend, it seems to me, 
to reduce block-treatment interactions by doing this. You may recall 
that an assumption necessary for the analysis of randomized blocks is 
that there be no block-treatment interaction, unless the blocks are a 
sample from a very large population of blocks (which is rare in my 
opinion). Clearly relocating blocks will result in blocks being more 
nearly alike which will tend to minimize the block-treatment interactions. 
We may also expect that this will at the same time result in a decrease 
in pure error variance because there will be a positive correlation 
between the location-within-block errors associated with any one pot 
during the time intervals unless the pots are moved. My opinion of 
your second question is therefore that to relocate blocks will almost 
certainly reduce error and make the usual analysis more nearly valid 
but the gain in efficiency may again be very slight. 

To close, one may suspect that if there were sizeable gains in net 
efficiency the procedure would have been used widely. On the other 
hand I know of no experimental work on the question. 

Oscar Kempthorne 


Iowa State College 
Ames, Iowa, U.S.A. 
Papers presented at the joint meeting of the Biometric Society (Brazilian Region) and 
the Brazilian Association for the Advancement of Science, Escola de Minas e Metalurgia, 
Ouro Preto, Minas Gerais, July 6, 1956 


402 C. G. FRAGA, JR., AND R. MEIRELLES DE MIRANDA 
(Instituto Agronômico, Campinas). Analysis of a Non-Orthogonal 
Experiment. 


Methods for the analysis of non-orthogonal experiments are dis- 
cussed and the process of fitting constants is considered in detail. 
This process was used in the analysis of a feeding trial with chickens. 
Four treatments were compared and five replicates were used, each 
consisting of ten birds, except one, which had nine chickens. The 
ratio of males to females in each replicate was variable. The constants 
were computed by least squares and also by using a successive approxi- 
mation method due to W. L. Stevens (Biometrika, 1948). 


403 J. SANTOS DANIEL, ITA R. K. ABRAMOF AND T. SILVA 
(Instituto Agronômico, Belo Horizonte). Statistical Analysis 
of the Number of Stomata in Coffea Under Different Experimental 
Conditions. 


The variation in the number of stomata was analyzed for 3 different 
parts of the same leaves, different leaves of the same plant, different 
plants of the same variety and 20 different varieties. The effects of 
shading and irrigation were also studied. The significance of main 
effects and interactions was evaluated by a series of analyses of variance. 
Significant variation was found among varieties and between different 
parts of the same leaves in some varieties. The observed decrease in 
the number of stomata under shading was highly significant, and 
the interaction of this effect with that of variety was also significant 
for the 2 varieties thus studied, namely “Caturra” and “Bourbon”, 
although the main effect of variety was not significant in this particular 
case. 


404 AMERICO GROSZMANN (Serviço Nacional de Pesquisas 
Agronômicas, Ministério da Agricultura, Rio de Janeiro). 
Growth Rate Studies in Corn. 


Growth differences in ten day intervals between two inbreds and 
their reciprocal hybrids were studied in a split-plot experiment with 
three replicates, having ten dates of cutting. Each plot consisted of 
five single plant hills. Growth curves are difficult to analyze statistically. 
Several methods were tried to fit the basic purpose of the present study. 
R. A. Fisher in Statistical Methods for Research Workers describes the 
percentage relative growth rate method which seemed to give the best 
answer for the analysis of the differences. In the first growth period 
there were no significant differences between inbreds, F1's and 
backcrosses. The F2's exhibited a significantly lower growth rate. The 
inbreds grew most rapidly during this period. 
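
As a minimal sketch of a percentage relative growth rate, assume it is computed between two cutting dates as 100 (ln w2 - ln w1)/(t2 - t1), the usual form of the relative growth rate. The weights and dates below are invented for illustration and are not data from the experiment.

```python
import math


def relative_growth_rate_percent(w1, w2, t1, t2):
    """Percentage relative growth rate per unit time between two harvests,
    computed as 100 * (ln w2 - ln w1) / (t2 - t1)."""
    if w1 <= 0 or w2 <= 0:
        raise ValueError("weights must be positive")
    if t2 == t1:
        raise ValueError("harvest times must differ")
    return 100.0 * (math.log(w2) - math.log(w1)) / (t2 - t1)


# Hypothetical dry weights (g) at three cuttings ten days apart.
weights = [2.0, 5.0, 11.0]
days = [10, 20, 30]

# One rate per interval between successive cuttings.
rates = [
    relative_growth_rate_percent(weights[i], weights[i + 1], days[i], days[i + 1])
    for i in range(len(weights) - 1)
]
print(rates)
```

Rates computed this way for each interval and each genotype could then be compared by an ordinary analysis of variance.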


405 FREDERICO PIMENTEL GOMES (Escola Superior de 
Agricultura “Luiz de Queiroz”, Universidade de S. Paulo). The 
Analysis of Factorial Experiments in Balanced Incomplete Blocks. 


The author explains in detail how factorial experiments can be 
set up in balanced incomplete block designs, which permits the reduction 
of block size without confounding. 

The analysis can be carried out with the adjusted means, by the 
same procedure used for factorials in complete block experiments, 
but afterwards the sum of squares thus obtained should be multiplied 
by a correction factor. This factor is $\lambda v/kr$ for the case of intra-block 
analysis, where v is the number of treatments and the other letters 
have the usual meaning. 

When the recovery of inter-block information is carried out, the 
correction factor turns out to be 


Nv + (r = a 
kr i 
where, 
Freon (teed NE 
re TV; = Ve 


V, is the residual variance and V, is the mean square for blocks (ad- 
justed), and for Cochran and Cox’s type I balanced incomplete block 
design, 
, - We =D = bg = DIV. 
TKE= OV, = OV. 
for type II (g is the number of groups of replications present in the design 
used) and 


a OG) VG 
~~ k(b-1)V,-—-W—b)V, 


a 


for type III. 

An example of analysis of a 2 X 2 X 2 factorial in balanced incom- 
plete block, with v = 8 treatments and b = 28 blocks of k = 2 plots 
per block is given. 


Papers presented at joint Biometric Society (ENAR) and I.M.S. sessions, Washington, 
March 7-9, 1957. 


406 FORMAN S. ACTON. The Mutual Difficulties of Statisticians 
and Digital Computers. 


The mutual difficulties of statisticians and digital computers arise 
from troubles in communication and a lack of generality in the formula- 
tion of statistical problems. The effort of programming and coding a 
problem for a digital computer is sufficient to require that a class of 
problems be included in any efficient routine. Statistical problems are 
still so heterogeneous as to resist this compacting. 

Machines obey languages that are unsuited to the accurate program- 
ming of complicated indicial arithmetic. Most statistical problems 
fall into this category. The obvious cures are suggested, with some 
guarded optimism being expressed. 


407 G. E. P. BOX (Statistical Techniques Group, Princeton 
University, Princeton, New Jersey). Iterative Experimentation. 


Scientific research is usually an iterative process. The cycle: con- 
jecture-design-experiment-analysis leads to a new cycle of conjecture- 
design-experiment-analysis and so on. It is helpful to keep this picture 
of the experimental method in mind when considering statistical prob- 
lems. Although this cycle is repeated many times during an investiga- 
tion, the experimental environment in which it is employed and the 
techniques appropriate for design and analysis tend to change as the 
investigation proceeds. 


Broadly speaking, one or more of the following four phases can be 
detected in most investigations: 
(a) a screening phase in which an attempt is made to isolate the 
important variables; 

(b) a descriptive phase in which the effects of the variables and the 
positions of interesting regions of the space of the variables are empiric- 
ally determined; 

(c) a phase leading from (b) to (d); 

(d) a theoretical phase in which an attempt is made to understand 
the actual mechanism of the process studied. 


The roles which statistical methods should properly play in assisting 
the iterative process at these various phases of experimentation were 
briefly discussed. 


408 J. E. FREUND AND W. O. ASH. Some Estimation Problems 
in Generalized Harmonic Analysis. 


An important problem in the analysis of stationary ergodic Gaussian 
processes is the estimation of the power spectral density function, 
$\Phi(\omega) = \frac{2}{\pi} \int_0^\infty \rho(\tau) \cos \omega\tau \, d\tau$, where $\rho(\tau)$ is the autocovariance function. 
The classical treatment of this problem is to estimate first $\rho(\tau)$ for 
various values of $\tau$ by systematic subsampling of the process and then 
using numerical integration for an estimate $\hat{\Phi}(\omega)$. While the classical 
estimator can be shown to be biased, it has nevertheless proved to be 
adequate in many applications. It has the disadvantage, however, of 
requiring a sizeable number of numerical calculations in order to produce 
a single estimate of $\Phi(\omega)$. 

An effort to get an estimator substantially as good as $\hat{\Phi}(\omega)$ but 
requiring considerably less numerical calculation has led to a new 
estimator for the spectral density, 


mela) = 2 D7 altnlts + ANG), 


where the $k_i$ are independent random variables having equal probability 
distributions $P(k_i)$ and the $G(k_i)$ are weight functions. It is shown 
that the bias of $\Phi^*(\omega)$ can be made the same as the bias of $\hat{\Phi}(\omega)$ by 
suitable choice of $G(k_i)$ and $P(k_i)$ and that the difference in the 
variances of the estimators may be expressed as 


Var ®*(w) — Var &(w) = — all [p (0) + p(jAt)] K*(j) En i|, 


where K(j) = G(j)P(j). It was also discussed how to minimize the 
variance by a suitable choice of the probabilities and weights. 
409 D. G. HORVITZ, G. T. FORADORI, J. MONROE, J. FLEISCHER 
AND A. L. FINKNER (North Carolina State College, 
Raleigh, N. C.). An Investigation on the Smoking Habits of 
Individuals. 


An investigation was undertaken in cooperation with the American 
Tobacco Company to determine the most efficient method of measuring 
the amount of average daily smoking of individuals. This study was 
conducted in three phases. 


Phase I 


(a) Several semi-objective techniques for measuring average daily 
current smoking were tested. A counter attached to the case of 
a Zippo lighter, which is activated when the lid of the lighter 
is closed, was adjudged the most accurate among those tested. 

(b) Approximately 92% of the variability among individuals in 
average weight of cigarettes smoked per day is explained by 
average number of cigarettes smoked per day. An additional 
significant proportion of the total variability is explained 
by using separate regressions for each type of cigarette, i.e. 
70 mm, 85 mm and 85 mm filter. 


Phase II 


Several questionnaire measures were compared with the semi-objec- 
tive technique selected as the standard. The questionnaire having the 
following classification: 


(i) less than 10 
(ii) 10-20 
(iii) 21-40 
(iv) over 40 


is improved by rearranging the intervals so that the values 10, 20, 30 
and 40 cigarettes smoked per day occur at some point within the interval 
rather than at the end of the interval. 

For estimating average smoking per day for a specified group, there 
is some statistical evidence favoring a questionnaire measure obtained 
by weighting the number smoked yesterday (a week day) by 5, number 
smoked Saturday by 1 and number smoked Sunday by 1. From a 
practical standpoint one question asking the number smoked on the 
average may be equally efficient. With the amount of data available, 
no difference in rates of misclassification could be detected among the 
questionnaire measures. 
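
The weighted questionnaire measure can be sketched as follows. The weights 5, 1 and 1 are those stated above; dividing by 7 to express the result as a daily average is an assumption of this sketch, and the function name is hypothetical.

```python
def weighted_daily_average(weekday, saturday, sunday):
    """Average cigarettes per day from three reported counts:
    the weekday count is weighted by 5, Saturday and Sunday by 1 each.
    Division by 7 (the sum of the weights) is an assumption here."""
    return (5 * weekday + 1 * saturday + 1 * sunday) / 7


# Hypothetical respondent: 20 on a week day, 30 on Saturday, 10 on Sunday.
print(weighted_daily_average(20, 30, 10))
```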
Phase III 


One-half of the permanent factory and stemmery employees of the 
American Tobacco Company were to be administered the questionnaire 
used by the Census Bureau in the 1955 Current Population Survey, 
and one-half were to be given a new questionnaire developed by the 
Institute of Statistics. Departures from this 50:50 ratio were due to 
random absenteeism. The completion rate of the schedules was 98.1%. 
Although it could not be determined which was the most accurate, 
there appeared to be real differences in the classification of individuals 
by the three questionnaire measures considered. 


410 CARL F. KOSSACK (Purdue University). A New Approach 
To General Purpose Sampling. 


A sequential design procedure for introducing several variables into 
a survey design so as to have the final design meet the individual 
requirements on each variable is considered in the case where a stratified 
sampling plan is used in which one stratum is a census. The problem 
of developing a population list which meets the coverage requirements 
for each variable is first resolved by preparing successive listings of the 
sampling units ordered in turn by each variable and tagging the units 
needed to meet the respective variable coverage requirement. Tagged 
units are always kept at the top of subsequent listings. At the end 
all tagged units are kept in the population. In designing the sample, 
the census stratum is built up successively so as to assure that the 
accuracy requirement for each variable is met, using estimated sampling 
rates for the other strata. For the final variable an optimum computa- 
tional design is made using the built-up census stratum. If the sampling 
rates thus obtained differ significantly from the original estimates, a 
convergent process of design is recommended. The application of the 
procedure to a two-strata cluster sampling plan is discussed in detail. 
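
The successive listing and tagging step might be sketched as below. The abstract does not define the coverage requirement, so this sketch assumes it is a required fraction of each variable's total; the data structures, unit labels and function name are hypothetical.

```python
def tag_units_for_coverage(units, requirements):
    """Tag sampling units variable by variable.

    units: dict mapping unit id -> dict of variable name -> value.
    requirements: dict mapping variable name -> required fraction of that
        variable's total (an assumed form of the coverage requirement).
    Returns the tagged unit ids in the order in which they were tagged.
    """
    tagged = []
    for variable, required in requirements.items():
        total = sum(values[variable] for values in units.values())
        # Already tagged units stay at the top of the listing and count
        # toward coverage; the rest are ordered by the current variable.
        covered = sum(units[u][variable] for u in tagged)
        untagged = sorted(
            (u for u in units if u not in tagged),
            key=lambda u: units[u][variable],
            reverse=True,
        )
        for u in untagged:
            if covered >= required * total:
                break
            tagged.append(u)
            covered += units[u][variable]
    return tagged


# Hypothetical population of four units and two variables.
units = {
    "A": {"acres": 500, "cattle": 10},
    "B": {"acres": 300, "cattle": 5},
    "C": {"acres": 150, "cattle": 400},
    "D": {"acres": 50, "cattle": 85},
}
result = tag_units_for_coverage(units, {"acres": 0.75, "cattle": 0.8})
print(result)
```

Units A and B are tagged to meet the acreage requirement and stay at the top of the cattle listing, where C is then added; all tagged units are kept at the end.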


411 H. L. LUCAS (North Carolina State College and Princeton 
University). Some Uses of the I.B.M. 650 in Applied Statistics. 


Experiences of the author and his colleagues with the I.B.M. 650 
seem to fall into three classes: 
(i) analysis of data, 
(ii) evaluation of unwieldy expressions including iterative solutions 
of complicated equations, and 
(iii) empirical investigation of sampling distributions. 
In analysis of data, difficulties arise from memory limitations, diversity 
of types of data and desired final results. Some procedures used to 
partition large or complicated problems and to meet the varied demands 
were described. In class (ii) is an iterative scheme for finding the points 
of non-symmetrical rotatable designs and a method to arrive at the 
expectations of complicated quartic and higher forms. The latter 
was outlined. Empirical studies have been made or are in progress on 
genetic advance under selection with chromosome crossover, certain 
sequential procedures, a life testing problem, some facets of non-linear 
estimation, and Type I and Type II errors in the unweighted means analysis 
of disproportionate data and the chi-square contingency test. Features 
of these familiar to the author were related. 


412 DALE M. MESNER (Purdue University Center, Fort Wayne, 
Indiana). The Structure of Incidence Matrices of Partially 
Balanced Incomplete Block Designs. 


In a PBIB design with $v$ treatments and $s$ associate classes, define 
$v \times v$ incidence matrices $A_i = (a^i_{uy})$, $i = 1, \ldots, s$; $a^i_{uy} = 1$ if treatments 
$u$ and $y$ are $i$-th associates, $a^i_{uy} = 0$ otherwise. These matrices were 
introduced by Bose. Their properties and several applications, some 
of them new, are reviewed here. Some of the applications were pre- 
sented by the writer at an earlier meeting (Ann. Math. Stat. 27 [1956] 
1185). The association schemes of designs of Latin square type with 
q constraints are defined by certain square arrays and have the property 
that fairly large sets of treatments are pairwise first associates. The 
matrices $A_i$ are used in a proof that for fixed $q$ and sufficiently large $v$ 
this property follows from the values of $n_i$ and $p^i_{jk}$, and in turn implies 
the structure of the square array. This constitutes a uniqueness proof 
of these schemes. 
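
As an illustration of such incidence matrices, the sketch below builds them for the simplest Latin square type scheme, assuming two constraints (rows and columns of an n by n array), so that two treatments are first associates when they share a row or a column. The choice of n = 3 and the function name are for illustration only.

```python
def latin_square_type_matrices(n):
    """Association matrices A1 and A2 for v = n*n treatments arranged in an
    n x n array: first associates share a row or column, all other distinct
    pairs are second associates."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    v = len(cells)
    a1 = [[0] * v for _ in range(v)]
    a2 = [[0] * v for _ in range(v)]
    for x, (r1, c1) in enumerate(cells):
        for y, (r2, c2) in enumerate(cells):
            if x == y:
                continue  # a treatment is not its own associate
            if r1 == r2 or c1 == c2:
                a1[x][y] = 1
            else:
                a2[x][y] = 1
    return a1, a2


a1, a2 = latin_square_type_matrices(3)
v = len(a1)

# Row sums give the numbers of first and second associates, n1 and n2.
n1 = sum(a1[0])
n2 = sum(a2[0])

# Identity + A1 + A2 should be the all-ones matrix.
covers_all = all(
    (1 if i == j else 0) + a1[i][j] + a2[i][j] == 1
    for i in range(v)
    for j in range(v)
)
print(n1, n2, covers_all)
```

Each row of a1 marks one treatment together with its first associates; the treatments in any one row of the array are pairwise first associates, which is the property referred to in the abstract.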


413 M. A. SCHNEIDERMAN, R. J. TAYLOR AND S. B. FAND 
(National Institutes of Health, Public Health Service, Bethesda, 
Maryland). Some Problems in the Clinical Trials of Anti-Cancer 
Agents. 


The design of clinical studies to evaluate anti-cancer agents requires: 


a. Sharp definition of the disease studied 

b. recognition of the special nature of advanced cancer 

c. recognition of the problems of associated toxicity, and the 
problems of “cost” in toxicity as a negative measure of the 
effectiveness of a material. The concept of the therapeutic 
ratio is discarded in favor of a response-for-a-given cost. The 
approach permits the evaluation of combination therapy, while 
the therapeutic index does not. 


Problems of measurement are discussed and a presentation is made 
of the results of 


a. measurement by different individuals 

b. the handling of serially correlated data (which all tumor measure- 
ments are) from a maximum likelihood point of view, instead 
of the common least squares approach which assumes independ- 
ence. 


Sample charts of the “cost” approach, of examples of measurement 
problems, and a “Protocol Planning Guide” are given. 


414 RAYMOND E. VICKERY (Agr. Est. Div., U.S. Department 
of Agriculture). Recent Experiences with Area Sampling for 
Agricultural Statistics. 


Probability area-sample surveys were conducted by the Agricultural 
Estimates Division in June of 1954, 1955 and 1956 as part of an exten- 
sive research program aimed at developing an improved crop and 
livestock estimating system. They were conducted under practical 
operating conditions to provide a better indication of how those pro- 
cedures would behave when put to use on a large scale and also to train 
the nucleus of a staff for large scale operations. The program covered 
700 segments of about 4 farms each in the 10 Southern States in 1954 
and 1955. It was expanded to include 13 North Central States in 1956. 

The approach in 1954 was to enumerate every farm for which the 
operator resided in the segment. There was heavy over-reporting of 
certain crops, especially those grown on a share basis. In 1955 the 
questionnaire was designed to eliminate this over-reporting. A list 
of 1,000 “large” farms was also added to the area sample that year. 
Although the over-reporting was corrected in 1955, sampling errors 
were not reduced appreciably. 

In 1955, a separate sample of 100 segments was used to test the 
practicability of accounting for the use of all land falling wholly within 
the boundaries of sample segments with respect to both crop acreage 
and livestock inventories. This “closed-segment” approach appeared 
so satisfactory that it was used in the entire survey covering 1100 
segments in 23 States in June 1956. Results clearly indicate that land 
use items can be estimated accurately by the closed segment method 
provided the sample is of sufficient size. Further studies are being 
made of the feasibility of gathering livestock data on closed segments. 
On the average, one closed segment yields at least as much information 
as two segments using the farm headquarters approach. 


415 R. LOWELL WINE. A Sampling Study of Sources of Information 
for Farm Families in Virginia. 


A statistical study was conducted for the Virginia Agricultural 
Extension Service to determine which of its services, along with all 
other media of communication, have been most effective in reaching 
and helping the farmer with regard to two practices in the home. The 
“area sampling method” was used in the “open country” zone within 
the West Central District of Virginia, which includes sixteen counties. 
Two per cent of all farm families were interviewed, the farmer furnishing 
the answers relative to his source (sources) of information concerning 
the variety of his main field crop and the type of fertilizer used on this 
crop, and the homemaker furnishing the answers relative to her source 
(sources) concerning any major kitchen improvement and method of 
preserving foods for home consumption. 

Some of the advantages and difficulties encountered in such a 
study are brought out and in some cases suggestions are made for 
improving the questionnaire. 
Australasian Region 


Among the papers given at the 32nd meeting of the A.N.Z.A.A.S., 
Dunedin, 16-23 January, 1957 were: Statistics Symposium, G. S. Watson, 
“Chi-squared test for the goodness-of-fit of normal distributions”; 
P. Whittle, “Non-linear stochastic processes”; J. H. Darwin, “Some 
models of population growth”; W. M. Harper and J. A. Macdonald, 
“Distribution of the mean half-square successive difference in sampling 
from a normal population”; Statistical Genetics Symposium, A. H. Carter, 
“On estimating heritability”; B. I. Hayman, “A representation of 
gene action”. 


Brazilian Region 


The Brazilian Region of the Biometric Society held its 3rd meeting 
at the Instituto Biologico, Sao Paulo, January 15, 1957. Scientific 
sessions were held at 10 a.m. and 4 p.m. Papers given included R. A. 
da Silva Leme, On checking the presence of a positive lower limit of a 
distribution of tensile strengths through an incomplete block design; 
F. Pimentel Gomes, Analysis of a group of thirty-eight fertilizer experi- 
ments with sugar-cane; P. Mello Freire and M. Picosse, Use of normaliz- 
ing transformation in the analysis of anthropometric indices; C. G. 
Fraga, Jr. and R. Meirelles de Miranda, Covariance in a nonorthogonal 
experiment; A. Conagin, Estimation of the number of repetitions to be 
made in future experiments; A. Groszmann and Venita S. Nascimento, 
Variance components in the interpretation of a series of biological 
data; F. Pimentel Gomes, Elementary proof of Scheffé’s test; and F. G. 
Brieger, Analysis of contrasts. 

The annual business meeting was held at 3 p.m. Officers were 
reelected for the 1957 term: President—C. G. Fraga, Jr., Secretary— 
P. Mello Freire, Treasurer—A. Groszmann. Commission members 
elected in 1957 for a three-year term are A. M. Penha and A. Conagin; 
elected in 1956, F. G. Brieger (1956), L. Freitas Bueno (1956), J. M. 
Pompeu Memoria (1956, 1957) and A. A. Bitancourt (1956, 1957). 


British Region 

At a meeting on March 7, 1957, the following papers were presented: 
M. R. Sampford, A linearly balanced design for dairy cattle experiments; 
G. G. Meynell, The inherently low precision of infectivity assays; 
G. A. Barnard, Why fix totals in tests on 2 X 2 contingency tables? 
ENAR 


A joint meeting with the Institute of Mathematical Statistics was 
held in Washington, D. C. during March 7-9, 1957. Attendance at 
many sessions was upwards of 200. Among the papers presented were: 
A. Sample Survey Methodology: L. Wine, Sources of information for 
farm families in Virginia; C. Kossack, General purpose sampling; 
R. E. Vickery, Area sampling for agricultural statistics. B. Stochastic 
Processes: J. E. Freund and W. O. Ash, Generalized harmonic analysis; 
F. W. Diederick, Applications to aeronautic problems. C. Design of 
Experiments: W. H. Horton, Fractional factorials in industry; G. E. P. 
Box, Problems in evolutionary operation; C. Y. Kramer, Factorial 
treatments in group divisible incomplete block designs; D. M. Mesner, 
Structure of incidence matrices of PBIB designs. D. Electronic com- 
puters: H. O. Hartley, Reduction in programming by standardized 
routines; H. L. Lucas, Some uses of the IBM 650 in applied statistics; 
F. S. Acton, The mutual troubles of statisticians and digital computers. 
E. Survey of Smoking Habits: D. G. Horvitz, G. T. Foradori, J. Monroe, 
J. Fleischer and A. L. Finkner, Smoking habits of individuals. F. Statistics 
and Probability: J. Cornfield, Statistical inference; G. Noether, 
Nonparametric tests; E. Lukacs, Analytic characteristic functions; 
N. Severo, Tests of the means of certain distributions. G. Public 
Health and Medical Statistics: M. G. Sirken, Collecting data from 
samples of recently deceased persons; M. A. Schneiderman, R. J. Taylor 
and S. Fand, Clinical trials of anti-cancer agents. 


French Region 


At the General Assembly, held on 13 February 1957, M. Michel Ollagnier 
was elected to the Council. M. André Vessereau presented a paper 
entitled “Sur les conditions d’application du criterium χ² de Pearson”. 


Italian Region 


A second course of Biometric Methodology was held at the Istituto 
Sieroterapico Milanese from October 8 to 20, 1956. The program was 
similar to that of the first course held at Varenna the previous year. 
The following courses of lectures were given by L. L. Cavalli-Sforza, 
G. A. Maccacaro, R. Scossiroli and F. Sella: (1) Applied Statistical 
methods (11 lectures), (2) Design of experiments (5 lectures), (3) 
Bioassay (4 lectures). Each lecture was followed by a half hour discussion 
and one and a half hours practical work with the students working 
in pairs. Further theoretical courses were (4) Health statistics (A. 
Tizzano), (5) Theoretical foundations (F. Brambilla), (6) Clinical 
statistics (G. Barbensi). 
Most of the students were graduates in medicine, but some attended 
whose main interests lay in veterinary medicine, general biology, 
pharmacy and other fields. Of 60 applications only 32 could be accepted. 
It is planned to hold a similar course during 1957. 


MEMBERS 


The following notifications of changes of address and of location of new 
members were received during February-April, 1957. 


New Addresses 


Dr. Peter Armitage, Cancer Chemotherapy, National Service Center, 
National Institutes of Health, Bethesda 14, Md., U.S.A. 

Dr. Geoffrey Beall, Division of Manufacturing, Gillette Safety Razor 
Co., Boston 6, Mass., U.S.A. 

Dr. F. E. Binet, Poultry Research Centre, Tarneit Road, Werribee, 
Victoria, Australia 

Dr. Archie Blake, 2133 N. Circle Drive, Ann Arbor, Michigan, U.S.A. 

Mr. James A. Bond, Dept. of Zoology, University of Chicago, Chicago 
37, Illinois, U.S.A. 

Mr. A. G. Constantine, C.S.I.R.O., Division of Math. Statistics, 
University of Adelaide, Adelaide, South Australia 

Robert Joseph Deam, B.P. Ltd., Beaufort House, Gravel Lane, London, 
England 

Mr. R. de Coene, I.N.E.A.C., Bambesa, Buta, Belgian Congo 

Alison Grant Doig, c/o Professor M. G. Kendall, London School of 
Economics, Houghton St. & Aldwych, London W.C. 2, England 

Mr. George E. Ferris, Apt. 46V, Building 8, 177 White Plains Road, 
Tarrytown, N. Y., U.S.A. 

Mr. Willis W. Frankhouser, Merck, Sharp and Dohme, West Point, 
Pennsylvania, U.S.A. 

Dr. J. M. R. Franckson, 21 Dieweg, Uccle, Belgium 

Dr. P. W. Geier, c/o Australian Scientific Liaison Office, Africa House, 
Kingsway, London, W.C.2, Great Britain 

Dr. Benson Ginsburg, Cobb Hall 215, University of Chicago, Chicago 
37, Illinois, U.S.A. 

Dr. Mordecai H. Gordon, Box 546, Perry Point, Maryland, U.S.A. 

Dr. Theodore H. Greiner, M.D., Anderson Dept. of Psychiatry, Baylor 
University College of Medicine, Houston, Texas, U.S.A. 

Dr. J. W. E. Harrisson, Library, P.C.P. and S., 43rd and Kingsessing 
Avenue, Philadelphia 4, Pennsylvania, U.S.A. 

Brian Ivanhoe Hayman, Crop Research Division, Private Bag, Christ- 
church, New Zealand 
Mr. J. A. Heady, Social Medicine Research Unit, Research Laboratories, 
London Hospital, Ashfield Street, London E. 1, England 
Mr. L. Harmon Hook, Apt. 23, 5427 University Avenue, Chicago 15, 
Illinois, U.S.A. 
Mr. Jay D. Leary, Jr., 70 Highland Street, Reading, Massachusetts, 
U.S.A. 
Dr. P. H. Leslie, Bureau of Animal Population, Botanic Garden, 
High Street, Oxford, England 
Mr. D. F. Matzinger, Department of Experimental Statistics, Box 5457, 
North Carolina State College, Raleigh, N. C., U.S.A. 
Miss Jean Miller, 12 Mill Lane, Cambridge, England 
Professor Per Nylinger, Skoghogskolan, Stockholm, Sweden 
Mr. Floyd R. Olive, U.S.O.M., c/o American Embassy, La Paz, Bolivia 
Dr. Bronson Price, 5813 Temple Hills Road, Washington, D.C., U.S.A. 
Mrs. Lila Knudsen Randolph, 8004 Riverside Drive, Cabin John Park, 
Maryland, U.S.A. 
Ingénieur Norbert Roussel, 15 Rue Combattants, Tirlemont, Belgium 
Dr. V. Sahleanu, Schitu Meguresnuss, Bucharesti, Rumania 
Professor Folmer D. Smith, Buoy, Tregde, Norway 
Ph van Riessen, Cornelis Jostraat 115, Scheveningen, Netherlands 
Mr. Bruno J. Vildosola, Sub-departmento de Bioestadistica, Casilla 
3979, Santiago, Chile 
Mr. William Weiss, Box 232, R.F.D. 4, Vienna, Virginia, U.S.A. 
Professor Max A. Woodbury, College of Engineering, New York Uni- 
versity, New York 53, N. Y., U.S.A. 
New Members 
At-Large 
Professor Dr. Alexander Alexandrovich Lubischew, Uljianovsk, Krasno- 
armeiskaya St. 2, K 4, U.S.S.R. 
Mr. G. E. Hodnett, Regional Research Centre, The Imperial College 
of Tropical Agriculture, St. Augustine, Trinidad, B.W.I. 
Professor Dr. Maximo Valentinuzzi, Calle Gascon 520, Buenos Aires, 
Suc. 13, Argentina, South America 


Australasian 


Dr. A. H. Carter, 88D Peachgrove Road, Hamilton, New Zealand 

Mr. M. L. Dudzinski, C.S.I.R.O., Box 109 City, Canberra, A.C.T., 
Australia 

Mr. C. R. Heathcote, Statistics Department, University of Melbourne, 
Carlton N.3, Victoria, Australia 


Professor A. L. Rae, Massey College, Palmerston North, New Zealand 


Brazilian 


Dr. Alfredo C. Nascimento Filho, Caixa Postal 25, Rio de Janeiro, 
Brazil 

Dr. Eneas Salati, Escola Superior de Agricultura, Piracicaba, Estado 
de Sao Paulo, Brazil 


Belgian 


Marcel J. W. Luttgens, Uangambi, B. P. 37, Belgian Congo 

British 

I. W. Bodmer, Dept. of Genetics, 44 Storey’s Way, Cambridge, Great 
Britain 

Dr. C. Daly, Glaxo Laboratories, North Lonsdale Rd., Ulverston, 
Lancashire, England 

J. S. Gale, Dept. of Genetics, 44 Storey’s Way, Cambridge, Great 
Britain 

Dr. D. Lindley, Statistical Laboratory, St. Andrews Hill, Cambridge, 
England 

Dr. C. C. Spicer, Central Public Health Laboratory, Collindale Avenue, 
London N.W. 9, England 


ENAR 


W. P. Cortelyou, Chairman, Department of Chemistry, Roosevelt 
University, Chicago 5, Illinois, U.S.A. 

Charles F. Federspiel, Dept. of Biostatistics, University of North 
Carolina, Drawer 229, Chapel Hill, N. C., U.S.A. 

Mr. James Grizzle, 11 B Davie Circle, Chapel Hill, N. C., U.S.A. 

Professor John Gurland, Statistical Laboratory, Iowa State College, 
Ames, Iowa, U.S.A. 

Dr. M. Hansen, Bureau of the Census, Washington, D.C., U.S.A. 

Mr. Arthur G. Itkin, 925 Jersey Avenue, Elizabeth, New Jersey, U.S.A. 

Dr. Benjamin Pasamanick, Research Division, Columbus Receiving 
Hospital, O.S.U. Health Centre, Columbus 10, Ohio, U.S.A. 

Mrs. Mary E. Ready, Route No. 5, Frederick, Maryland, U.S.A. 

Dr. Richard W. Roberts, 837 First Street, Rothschild, Wisconsin, 
U.S.A. 

Dr. A. Sprott, 167 Glen Road, Toronto, Canada 

Mrs. Hanna D. Sylwestrowica, 153 Park Avenue, Madison, N. J., 
U.S.A. 


WNAR 
Stanley R. Hill, Metabolic Lab., College of Osteopathic Physicians 
and Surgeons, 1721 Griffin Avenue, Los Angeles 31, California, U.S.A. 


French 


M. René Chouchan, Ingénieur en chef du Service Statistique, Com- 
pagnie Française des Métaux, Givet, (Ardennes) France 

Monsieur Maurice Fautrel, 8 Avenue Alphand, Paris 16, France 

M. Guy Roberty, Institut d’Enseignement et de Recherches Tropicales,
Route d’Aulnay, Bondy, (Seine), France 


German 


Prof. Dr. Ernst Assmann, Waldecker Hohe 1303, (13b) Miesbach/
Oberbayern, Germany

Dr. Detlev Bruning, 19 Stendal, Tangermunderstr. 3b, Fernruf 684, 
Germany 

Gunther Caroli, Freiburg Br., Bertholdstr. 17, Germany 

Dr. Heinz Fink, Ludwigshafen am Rhein, Pfalzgrafenstr. 46, W.
Germany 

Dr. Gerhard Specht, Kleinmachnow b. Potsdam, Philipp-Muller-Allee
42, Germany 

Dr. Friedrich Wasserfall, Kiel, Kronshagener Weg 32, Germany

Dr. Franz Weiling, Bonn, Rhein, Lengsdorf, Kapellenstrasse 65, West 
Germany 


Italian 


Dr. E. Robotti, Villaggio Sanatoriale di Sondalo, (prov. Sondrio) Italy

Gian Tommaso Scarascia, Viale Mazzini 13, Rome, Italy


Netherlands 


Dr. J. A. H. Gooszen, Dolderseweg 158A, Den Dolder, Netherlands 

Dr. Franz Adolf Nelemans, Cornelis Houtmanstraat 18, Utrecht, 
Netherlands 

Mr. H. A. Tas, Banstraat 59b, Amsterdam 21, Holland 

Jan T. N. Venekamp, Eelderwolde 13, Groningen, Post Eelde,
Netherlands

J. C. A. Zaat, Wageningen, Dr. Wiemayer Str. 6, Holland 


Swiss 


Dr. Christian Auer, Lurlibadstr. 115, Chur, Switzerland 

Dr. Edgar Grasemann, Institut fur Haustierernahrung des Eidg.
Technischen Hochschule, Universitatstr. 12, Zurich, Switzerland 

Prof. Dr. Hans Lortscher, Animal Breeding Institute, Swiss Federal 
Institute of Technology, Universitatstr. 2, Zurich, Switzerland 

Martin Menzi, Morgentalstrasse 21, Zurich 38, Switzerland


THE BIOMETRIC SOCIETY
SECRETARY’S IMPREST ACCOUNT


Statement of Income and Expenditure during the period ended 31st Dec., 1956. 


                                       £   s.  d.      £   s.  d.
Income
  Treasurer ($600 Imprest)           213   4  11
  Membership Subscriptions (2)         3   3   8
                                                     216   8   7
less Expenditure
  Office Furniture                   [illegible]
  Stationery                         [illegible]
  Secretarial assistance              31   5   -
  Postages                           [illegible]
  Sundries                           [illegible]
                                                     119  10   6
Balance in Hand, 31.12.56                            £96  18   1


I certify the above to be a true record of my transactions on behalf of the Biometric 
Society. 
12th March, 1957 M. J. R. Healy (Signed) 
Secretary 


I have examined the account book and vouchers produced by the Secretary and
certify that the above statement is in accordance therewith.


12th March, 1957 E. Church (Signed) 
(E. Church) A.A.C.C.A.


TREASURER’S REPORT, 1956


Balance Sheet 


ASSETS
  Cash: $10,031.18 less $7,441.22                       $2,589.96
    (Bank balance    $2,554.96
     plus Petty Cash $   35.00)

LIABILITIES
  Subscriptions, 1957               $   29.75
  Dues, 1957                            20.25           $   50.00
  Surplus, 1/1/56                   $1,956.89
  Gain for Period                      583.07            2,539.96
                                                        $2,589.96


Income and Expenditure Statement


INCOME
  Subscriptions, 1955                     $  473.50
                 1956                      4,178.50     $4,652.00
  Dues,          1955                     $  137.25
                 1956                      1,973.75      2,111.00
  Sustaining Memberships, 1956                             455.00
  Back dues and subscriptions                              105.75
  Regional allotments                     $   78.50
  BIOMETRICS allotments from
    Sustaining Members                       175.00
  Back issues                                247.50
  Reprints and directories                     3.40
  Journal of A.S.A. for Biometric Society
    Members                                  115.00
  Striplist                                   10.00
  Sale of furniture and equipment            219.00
  Consular fee                                 3.00
  Bank charges                                  .25
  Overpayments                                72.17
                                          $  923.82
  Less credits and allotments used           214.28        709.54
Total Income                                            $8,033.29
EXPENDITURE
  Secretary’s office                                    $  609.00
  Subscription to ASSOCIATIONS                               5.00
  Subscriptions to J.A.S.A. for Society
    Members                                                120.00
  Shipping and trucking charges                            110.33
  Consular fee                                               3.00
  Bank charges                                                .58
  Miscellaneous                                               .60
  BIOMETRICS                                             4,897.11
  Special services                                          78.50
  Postage                                                   95.85
  Stationery and supplies                                   72.83
  Telephone                                                 85.83
  Printing                                                  70.93
  Salaries                                               1,300.66
Total Expenditures                                      $7,450.22
Excess of Income over Expenditures                         583.07
                                                        $8,033.29


Audited: Charles A. Smith 
Date: January 14th, 1957. 


Operations Statement of Biometrics Volume 12 (1956)


INCOME (1 February 1957)

Subscribers
  Biometric Society
    557 at $ 4.00                     $ 2,228.00
    734 at   2.75                       2,018.50
      7 Sustaining at 25.00               175.00     $ 4,421.50
  493 ASA at 4.00                                      1,972.00
  759 Direct at 7.00                                   5,313.00
Sale of back issues                                    2,495.97
Sale of reprints                                         692.94
Exchange                                                    .87
                                                                  $14,896.28

EXPENDITURES (1 February 1957)

Cost of Journals
  Printing
    Issue No. 1        $ 1,526.
    Issue No. 2          2,190.
    Issue No. 3          1,946.
    Issue No. 4        [illegible]    $ 8,921.09
  Mailing and Express Charges
    Issue No. 1            134.
    Issue No. 2            163.
    Issue No. 3            143.
    Issue No. 4            221.           663.19       9,584.28
Cost of Reprints
  Printing
    Issue No. 1            163.
    Issue No. 2            324.
    Issue No. 3            169.
    Issue No. 4            343.         1,000.96
  Mailing Charges
    Issue No. 1        [illegible]
    Issue No. 2        [illegible]
    Issue No. 3             15.
    Issue No. 4             26.            81.31       1,082.27
OPERATING EXPENSES
  Stamps                              $   330.00
  Stationery                              133.43
  Duplication Work                         52.02
  Telephone                                 3.75
  Insurance                               136.03
  Customs                                   5.63
  Bank Charges                             15.82
  Exchange (Transfer of cheques)           22.25
  Joseph Ruzicka, Book Binding             19.35
  Express Charges                          28.52     $   746.80
                                                                  $11,413.35

Income                                                            $14,896.28
Expenditure                                                        11,413.35
Surplus                                                           $ 3,482.93


Balance Sheet Biometrics Volume 12


ASSETS (1 February 1957)
  Accounts Receivable                                $   590.16
  Bank Balance
    U.S.                              $ 7,491.05
    Canadian                            1,085.33       8,576.38
  U.S. Treasury Bond                                   5,000.00
                                                     $14,166.54

LIABILITIES (1 February 1957)
  Subscriptions to Volume 13                         $ 1,491.00
  Balance from Previous Volumes                        8,602.45
  Surplus from Volume twelve,
    Including Accounts Receivable                      4,073.09
                                                     $14,166.54


Audited:
A. W. Quealy


NOTE:
Not included in Assets is Stock of back issues from Volume 1-12 and Reprints.


NEWS AND ANNOUNCEMENTS 


Members are invited to transmit to their National or Regional Secretary 
(if members at large, to the General Secretary) news of appointments,
distinctions or retirements and announcements of professional interest. 


D. R. Cox of the University of North Carolina has been appointed 
to a Readership in Statistics, University of London, at Birkbeck College.


Jack Moshman has left Bell Telephone Laboratories, Inc., where 
he was consulting statistician, to assume the post of Director of the
Division of Mathematical and Statistical Services of the Council for 
Economic and Industry Research, Inc. 


D. E. W. Schumann, Head, Department of Statistics, University of 
Stellenbosch, South Africa and R. A. Bradley, Professor of Statistics, 
Virginia Polytechnic Institute, Blacksburg, Virginia, U.S.A. were 
co-authors of a paper entitled “The comparison of the sensitivities
of similar experiments”, which won the J. Shelton Horsley Research
Award of the Virginia Academy of Science at the May meeting at Old 
Point Comfort, Virginia. The J. Shelton Horsley Research Award is 
awarded annually to the best paper submitted in the competition in 
Virginia. 

M. C. K. Tweedie has accepted a position as Temporary Lecturer in
Mathematical Statistics at the University of Manchester, England. 
He has been on the Faculty of the Virginia Polytechnic Institute, 
Blacksburg, Virginia, for the past four years. 


NOTICES

Department of Statistics, University of Chicago 
The department of statistics at the University of Chicago, known 
since its organization in 1949 as the Committee on Statistics, is now 
called the Department of Statistics. The name was changed from 
Committee to Department in order to avoid confusion about the nature 
and status of the organization. 


Leonard J. Savage, who has been Acting Chairman of the Department 
this year, has accepted a regular appointment as Chairman beginning 
March 1, 1957. He succeeds W. Allen Wallis, who now is Dean of the 
School of Business though he continues as a member of this Department. 
Other members of the Statistics faculty are K. A. Brownlee, Kai Lai 
Chung, Sudhish G. Ghurye, Leo A. Goodman, William Kruskal, John 
W. Pratt, Harry V. Roberts, and David L. Wallace. 


Research Center, General Foods Corporation


Central Laboratories, General Foods Corporation, 1125 Hudson 
Street, Hoboken, New Jersey, is changing its name and address to 
Research Center, General Foods Corporation, 555 South Broadway, 
Tarrytown, New York, effective July 1, 1957. 


INTERNATIONAL CONGRESS OF MATHEMATICIANS 1958 


At the invitation of the City and University of Edinburgh and 
the Royal Society of London, the International Congress of Mathe- 
maticians will meet in Edinburgh from August 14 to August 21, 1958. 
His Royal Highness the Duke of Edinburgh has graciously consented 
to extend his patronage to the Congress. 

The Executive Committee is inviting a number of mathematicians 
to deliver one-hour and half-hour addresses. There will also be daily 
sessions devoted to fifteen-minute communications. 

There will be eight sections, namely: 1. Logic and Foundations. 
2. Algebra and the Theory of Numbers. 3. Analysis. 4. Topology. 
5. Geometry. 6. Probability and Statistics. 7. Applied Mathematics, 
Mathematical Physics and Numerical Analysis. 8. History and Educa- 
tion. A program of entertainments and excursions is being planned. 

Those who wish to receive further information about the Congress 
may write to Frank Smithies, Secretary of the International Congress 
of Mathematicians, Mathematical Institute, 16 Chambers Street, 
Edinburgh, 1, Scotland. 


XVth INTERNATIONAL CONGRESS OF ZOOLOGY, LONDON, 1958


The XVth International Congress of Zoology will take place in 
London from 16th-23rd July, 1958, under the Presidency of Sir Gavin 
de Beer, F.R.S., Director of the British Museum (Natural History) 
assisted by an Advisory Committee comprising all the leading British 
zoologists. 

A number of special topics and sessions are being arranged which, 
while covering a wide field, will centre around Evolution. Congress 
will be organized into 12 Sections, and will be preceded by a Colloquium 
on the Rules of Zoological Nomenclature. 

At the Inaugural Meeting in the Royal Albert Hall, Dr. Julian 
Huxley will give a special Darwin-Wallace Centenary address. At the 
concluding meeting, Professor J. Millot will speak on Coelacanths.

Further information may be obtained from the Registrar of the
XVth International Congress of Zoology, c/o British Museum (Natural
History), London, S.W. 7, England.


